
CERTIFICATION OF TRANSLATION 



The undersigned, Jessica T. Abreu, whose address is 2334 N. Van Buren Court, Arlington, VA 22205-1939,
USA, declares and states as follows:

I am well acquainted with the English and French languages; I have in the past translated numerous
French documents of legal and/or technical content into English; and I am certified by Georgetown University and
accredited by the American Translators Association in French into English translation;

I have been asked to provide a translation key for a document entitled: Sequencing List Aventis Pharma,
Institut National de la Santé et de la Recherche Médicale, Compounds Capable of Modulating the Activity of
Parkin, Nucleotide Sequences and Uses.

I hereby declare that the attached translation of the document referenced above is, to the best of my
knowledge and ability, a true and accurate translation of the original French document.

And I declare further that all statements made herein of my own knowledge are true, that all statements
made on information and belief are believed to be true, and that falsification of these statements and the like is
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code.

I therefore attach my Certification of Translation to the English translation of this document. 



October 29, 2001



Date 



City/County of [illegible]
Commonwealth of Virginia

The foregoing instrument was subscribed and sworn to before me on this
[illegible] day of [illegible] 2001 by

Notary Public

Divina A. Rutherford
Notary Public
Commonwealth of Virginia
My Commission Expires Sept. 30, 2004




CERTIFICATION OF TRANSLATION

[...] 1939, USA, declares and states as follows:

[...] Regulating the Activity of Parkin.

I hereby declare that the attached translation of the document referenced above is, to the best of my
knowledge [...]

And I declare further that all statements [...]
made on information and [...]
punishable by fine or imprisonment, or both, under Section [...]

I therefore attach my Certification of Translation to the English translation of this document.

October 3, 2001

Date

The foregoing instrument was subscribed and sworn to before me on this
[illegible] day of [illegible] 2001 by

Divina A. Rutherford
Notary Public
Commonwealth of Virginia
My Commission Expires Sept. 30, 2004




FIG. 1
Ly111b-fullA : the transcript

AATGGAAGGGCGTGAGCGCTTGGTCCATGCAGTGAAGCTCTTCCAACCTGGGTCAACGAAAACG 

GAGAAGAJ^TGGCCCMGAAATAGATCTGAGTGCTCTCMGGAGTTAGAACGCGAGGC^ 

CCAGGTCCTGTACCGAGACCAGGCGGTTCAAAACACAGAGGAGGAGAGGACACGGAAACTGAAA 

ACACACCTGCAGCATCTCCGGTGGAAAGGAGCGAAGAACACGGACTGGGAGCACAAAGAGAAGT 

GCTGTGCGCGCTGCCAGCAGGTGCTGGGGTTCCTGCTGCACCGGGGCGCCGTGTGCCGGGGCTG 

CAGCCACCGCGTGTGTGCCCAGTGCCGAGTGTTCCTGAGGGGGACCCATGCCTGGAAGTGCACG 

GTGTGCTTCGAGGACAGGMTGTCAAMTAAAAACTGGAGAATGGTTCTATGAGGAACGAGCCA 

Laaatttccmctggaggcamcatgagacagttggagggcagctcttgcaatcttatcagaa 

GCTGAGCAAAATTTCTGTGGTTCCTCCTACTCCACCTCCTGTCAGCGAGAGCCAGTGCAGCCGC 
&C'rrrTRRf!AGG TTACAGGAATTTGGTCAGTTTAGAGGATTTAATAAGT CCGTGGAAAATTTGT 
TTPTfZTf'TrT TGPTACCCACGTGAAAAAGCTCTCCAAATCCCAGAATG ATATGACTTCTGAGAA 
GCATCTTCTCGCCArr,GGCCCCAGGCA r,TGTGTGGGACAGACAGAGAGACGGAGCCAGTCTGAC 
arTr,rr,r,Tr.AAC GTCACCACCAGG AAGGTCAGTGCA CCAGATATTCTGAAACCTCTCAATCAAG 

AGGATCCCAAATGCTCTACTAACC CT^^ 

CAGTACCATATTCTCTGGAGGTTTTAGACACGGAAGTTTAATTAGCATTGACAGCACCTGTACA 
GAGATGGGCAATTTTGACAATGCTAATGTCACTGGAGAMTAGM 

TCAAAACCCATTCTTTAGAAATATGCATCAAGGCCTGTAAGAACCTTGCCTATGGAGAAGAAAA 
GMGAAAAAGTGCAATCCG^TGTGMG 

CGCAAGACTGGAGKCAMGGAACACCGTGGACCCGACCTTTCAGGAGA 

Sgg^otgwcS 

CCGGAGAGTGTTTCTTGGAGAAGTGATCATTCCTCTGGCCACGTGGGACTTTGAAGACAGCACA 
ACACAGTCCTTCCGCTGGCATCCGCTCCGGGCCAAGGCGGAGAMTACGMGACAGCGTTCCTC 

AGAGTAATGGAGAGCTCACAGTC 
AGAGGCTCAAGAAGGGACAGATCAGCCATCACT 

GCCMGAAmA^CTGT^ 

TGCCAGACCAACAAAAACTGAGACTGAAGTCGCCAGTCCTGAGGAAGCAGGCTTGCCCCCAGTG 

GMACACTCATT^TCAGTGG 

ACTGTCTGGGATCAGGCCCTCTTTGGAATGAA^ 

CAAAGGGAGACACAGCTGTTGGCGGGGATGCATGCTCA 

COT?CCAGCCCCAATCTM 
GTTCCAGGTTGCAGCAGGCGTGAGG 



pLy111b-fullA : the protein

Ly111b-fullB : the transcript
£tCTC^G(^ 

arArA^TGCTGTGCGCGCTGCCAGCAGGTG 
CCCGGGGCTGCAGcS 

SgtgcacgS^ 

TGAGGAACGAGCCAAGAAAT^ 

rAG^CCAGTGCAGCCGCAGTCCTGGCAGGAAGGTCAGTGCACCAGATATTCTGAAACCTCT 
CMTCMGAGGATC^ 

f^CG^ACCCAGTACCATATTCTCTGGAGGTTTTAGACACGGMGTTTAATTAGCATTGAC 
ArCACCTGTACAGAGATGGGCAA 

ca5tcaS5gcto 

CCTATGGAGAAGAAAAGMGAAAMGTC^ 

ArATCCTCCCAGGGAMGCGCAAGACTGGAGTCCAAAGGAACACCGTGGACCCGACCTTTCA 

ggagacotgmgtS^^ 

TGTGGCATCTGGGCACGCTGGCCCGGAGAGTGTTTCTTGGAGMGTGATCATTCCTCTGG 

ACGTGGGACTTTGAAGACAGCAC 
GGAGAMTACGAAGACAG^ 

Jatggtcaactttgtt^ 

cttgmcItcatttgttaagggctgtctca 

cgccagtcctgaggaagcaggcttgccccca^^^ 

A^AGC^S^ 

AATGAACGACCGCTTGCTTGGAGGMCCAGACTTGG 

^GATOCATGCTCACAATCGAAGCTCCAGTGGCAGAAAGTCCTTTCCAGCCCCAA 

acagIcatga^ 

TGAGG 



pLy111b-fullB : the protein

Ly111b-fullB : the transcript

GGCCTTGGGGCACTGAGGGATGCCAGXICIG^ 

GCGTTGCCCCTGCTGGCATAGTCAGGTACCAGCCCAGCCAGGTATTGAACGGGCTGAGCTTTTCATGA 

tggttcctgctgacctggaaacatcttaaatggaagggcgtgagcgcttggtccatgcagtgaagctc 
ttccmcctgggtcmcgaamcggagmgaaSgcccmgamtagatctgagtgctctcaaggag 

TTAGMCGCGAGGCCATTCTCCAGGTCCTGTACCGAGACCAGGCGGTTCAAMCACAGAGGAGGAGAG 

GACACGGAMCTGAAAACACACCTGCAGCATCTCCGGTC^AGGAGCGM^ 

ACAMGCGMGTGCTGTGCGCGCTGCCAGCAGGTGCTGGGGTTCCTGCTGCACCGGGGCGCCGTGTGC 

CGGCGCTGCAGCCACCGCGTGTGTGCCCAGTGCCGAGTGTTCCTGAGGGGGACCCATGCCTGGMCT 

CACGGTGTGCTTCGAGGACAGCAATGTCAAAATAAAAACTGGAGAATGGTTCTATGAGGAACGAGCCA 

AGAMTTTCCMCTGGAGGCAAACATGAGACAGTTGGAGGGCAGCTCTTGCAATCTTATCAG^ 

AGCAAMTTTCTGTGGTTCCTCCTACTCCACCTCCTGTC^GCGAGAGCCAGTGCAGCCGCAGTCCTGG 

panCTTar&c naATTTPrTrArTTTAGAGGA TTTAATAAGTCCGTGGAAAATTTGTTTCTGTCTCTTG 

P^l ^ 

CAGGMGGTCAGTGCACCAGATATTCTGAMCCTCTCMTCMGAGGATCCCA^ 

pt a tttt ca AflP A Af AG A ATCTCCCATCC AGTCCGGCACCCAGTACCATATTCTCTGGAGGTTTTAGA 

TGGAGAMTAGMTTTGCCATTCATTATTGCTTCAAAACCCATTCTTTAGAMTATGCATCAA^ 
GTMGMCCTTGCCTAT(£AGMGAAAA^^ 

CCC^CAGATCCTCCCAGGGAMGCGCMGACTGGAGTCCAAAGGMCACCGTGGACCCGACCOT 

GGAGACCTTGAAGTATCAGGTGGCCCCTGCCCAGCTGGTGACCCGGCAGCTGCAGGTCTCGGTGTGGC 

ATCTGGGCACGCTGGCCCGGAGAGTGTTTCTTGGAGAAGTGATCATTCCTCTGGCCACGTGGGACTTT 

GAAGACAGCACMCACAGTCCTTCCGCTGGCATCCGCTCCGGGCCAAGGCGGAGAM 

CGTTCCTCAGAGTAATGGAGAGCTCACAGTCCGGGCTMGATGGTTCTCCCTTCACGGC^ 

TCCAAGAGGCTCMGMGGGACAGATCAGCCATCACTO 

GCCMGMTTTACCTGTGCGGCCAGATGGCACCTTGAACTCATTTGTTAAGGC 

AnarrAAPAAAA APTfl AP, APTGA AGTCGCCAGTCCTGAGGAAGCAGGCTTGCCCCCAGTGGAAACACT 

CATTTGTCTTCAGTGGCGTAACCCCAGCTCAGCTGAGGCAGTCGAGCTTGGAGTTAACTGTCTGGGAT 

PAnPTPPT PTTTP^AATGAACGACGGCTTGC TTGGAGGMCCAGACTTGGTTCAMGGGAGACACAGC 

TGTTGGCG GGGATGCATGCTCACMTCGMG CTCCAGTGGCAGAAAGTCCTTTCCAGCCC^ 

GGACAGACATGACTCTTGTCCTGCACTGACATGMGGCCTCM 

GCACTGTGCGTCTGCAGAGGGGCTACGMCCAGGTGCAGGGTCCCAGCTGGAGACCCCT^ 

AGCAGTCTCCATCTGCGGCCCTGTCCCATGGCTTMCCGCCTATTGGTATCTGTGTAT^^ 

MCj^TATGTTACCTMG^^^ 

ttIS^cgttgttacccatgaaaaaaaaaaaaa 

: the protein 

GEVIIP^CTDFEDSTTQSFRWHPL^KYEDSVPQSNGELTVm 
SLHGOLCLmS 

lrqgsi^l™^ 
: the transcript

GAAATCATGCCCC TCGTAGAGCAGCAGGTCCAAGCAGGGCTGCTGGCTATTTTTC CAAAAAG 
TGAGGCAGTTTAAAAAAAAGGCGGAGAACTAGAATTATAGAATAATGGCACATTTTGTGTAT 

ttrtaaaactaacgccttgcatg gttcacaacccatttctta tgcctgtgttttccttggca 
gcaaaatttctgtggttcctcctactccacctcctgtcagcgagagccagtgcagccgcagt 
cctggcaggaaggtcagtgcaccagatattctgaaacctctcaatcaagaggatcccaaatg 
rtrtartaapnr.tattttg aagcaacagaatctcccatcca gtccggcacccagtaccatat 
tctctggaggttttagacacccaactttaattaccattgacagcacgtgtacagagSsgc 
aattttgacaatgct aatgtcactggagaaatagaatttgccattcattattgcttcaaaac 
ccattctttagaaatatgcatcaaggcctgtaagaaccttgcctatggagaagaaaagaaga 
aaaagtgcaatccgtatgtgaagacctacctgttgcccgacagatcctcccagggaaagcgc 

AAGACTGGAGTCCAAAGGAACACCGTGGACCCGACCTTTCAGGAGACCTTGAAGTATCAGGT 
GGCCCCTGCCCAGCTGGTGACCCGGCAGCTGCAGGTCTCGGTGTGGCATCTGGGCACGCTGG 
CCCGGAGAGTGTTTCTTGGAGAAGTGATCATTCCTCTGGCCACGTGGGACTTTGAAGACAGC 
araararar.TrrTTPrr,PTr,r,rATCCflCTCCGGGCCA AGGCGGAGAAATACGAAGACA GCGT 
TCCTCAGAGTAATGGAGAGCTCACAGTCCGGGCTAAGCTGGTTCTCCCTTCACGGCCCAGAA 
AACTCCAAGAGGCTCAAGAAGGGACAGATCAGCCATCACTTCATGGTCAACTTTGTTTGGTA 
GTGCTAGGAGCCMGAATTTACCTGTGCGGCCAGATGGCACCTTGAACTCATTTGTTAAG^ 
CTGTCTCACTCTGC CAGACCAACAAAAA CTGAGACTGAAGTCGCCAGTCCTGAGGAAG CAGG 
CTTGCCCCCAGTGGAAACACTCATTTGTCTTCAGTGGCGTAACCCCAGCTCAGCTGAGGCAG 
T rr.ar.PTTP.CAr.'P'PaarTfiTrTflRGATCAGGCCCT CTTTGGAATGAACGACCGCTTG CTTGG 
AGGAACCAGACTTGGTTCAAAGGGAGACACAGCTGTTGGCGGGGATGCATGCTCACAATCGA 
AGCTCCAGTGGCAGAAAGTCCTTTCCAGCCCCAATCTATGGACAGACATGACTCTTGTCCTG 
CACTGACATGAAGGCCTCAAGGTTCCAGGTTGCAGCAGGCGTGAGGCACTGTGCGTCTGCAG 
AGGGGCTACGAACCAGGTGCAGGGTCCCAGCTGGAGACCCCTTTGACCTTGAGCAGTCTCCA 
TCTGCGGCCCTGTCCCATGGCTTAACCGGCTATTGGTATCTGTGTATATTTACGTTAAACAC 
auttatrttapptaaRPP.T PTGGTGGGTTATCTCCTCTT TGAGATGTAGAAAATGGCCAGAT 



: the protein 

MGNFDNANVTGEIEFAIHYCFKTHSLEICIKACKNLAYGEEKKKKCNPYVKTYLLPDRSSQG 
KRKTGVQRNTVDPTFQETLKYQVAPAQLVTRQLQVSVWHLGTLARRVFLGEVIIPLATWDFE 
DSTTQSFRWHPLRAKAEKYEDSVPQSNGELTVRAKLVLPSRPRKLOEAQEGTDQPSLHGQLC 
LWLGAKNLPVRPDGTLNSFVKGCLTLPDQQKLRLKSPVLRKQACPQWKHSFVFSGVTPAQL 
RQSSLELTVWDQALFGMNDRLLGGTRLGSKGDTAVGGDACSQSKLQWQKVLSSPNLWTDMTL 

VLH 

FIG. 10 
COMPOSITIONS WHICH CAN BE USED FOR REGULATING THE ACTIVITY OF PARKIN

The present invention relates to compositions and methods which can
be used for regulating the activity of parkin. It relates in particular to a novel protein,
referred to as PAP1, which is a partner of parkin, as well as to the peptides or
polypeptides which are derived from or are homologous to this protein. It also relates
to compounds which are capable of modulating, at least partially, the activity of
parkin, in particular of interfering with the interaction between parkin and PAP1. The
present invention can be used in the therapeutic or diagnostic areas, or for forming
pharmacological targets which make possible the development of novel drugs.

The parkin gene is mutated in certain familial forms (autosomal
recessive juvenile) of Parkinson's disease (Kitada et al., 1998). Parkinson's disease
(Lewy, 1912) is one of the most common neurodegenerative diseases, affecting more
than 1% of the population over 55 years old. Patients suffering from this disease have
neurological disorders which are grouped together under the term "Parkinsonian
syndrome," which is characterized by rigidity, bradykinesia, and resting tremor. These
symptoms are the consequence of a degeneration of the dopaminergic neurons of the
substantia nigra of the brain.

Most cases of Parkinson's disease do not have a familial history.
However, familial cases do exist, of which certain correspond to a monogenic form of
the disease. At the current time, only three different genes have been identified in
certain rare hereditary forms. The first form corresponds to an autosomal dominant
form, in which the gene responsible encodes alpha-synuclein (Polymeropoulos et al.,
1997). This protein is an abundant constituent of the intracytoplasmic inclusions,
termed Lewy bodies, which are used as a marker for Parkinson's disease (Lewy,
1912). The second form, also autosomal dominant, is linked to a mutation in a gene
which encodes a hydrolase termed ubiquitin carboxy-terminal hydrolase L1 (Leroy et
al., 1998). This enzyme is assumed to hydrolyze ubiquitin polymers or conjugates into
ubiquitin monomers. The third form differs from the previous forms in that it has an autosomal
recessive transmission and an onset which often occurs before 40 years of age, as well as
an absence of Lewy bodies. These patients respond more favorably to levodopa, a
dopamine precursor which is used as treatment for Parkinson's disease. The gene
involved in this form encodes a novel protein which is termed parkin (Kitada et al.,
1998).

The parkin gene consists of 12 exons which cover a genomic region of
more than 500,000 base pairs on chromosome 6 (6q25.2-q27). At the current time, two
major types of mutation of this gene, which are at the origin of the disease, are known:
either deletions of variable size in the region which covers exons 2 to 9, or point
mutations which produce the premature appearance of a stop codon or the change of
an amino acid (Kitada et al., 1998; Abbas et al., 1999; Lücking et al., 1998; Hattori et
al., 1998). The nature of these mutations and the autosomal recessive mode of
transmission suggest a loss of function of parkin, which leads to Parkinson's
disease.

This gene is expressed in a large number of tissues and in particular in
the substantia nigra. Several transcripts which correspond to this gene and originate
from different alternative splicing sites (Kitada et al., 1998; Sunada et al., 1998) exist.
In the brain, two types of messenger RNAs are found, of which one lacks the portion
corresponding to exon 5. In the leukocytes, parkin messenger RNAs which do not
contain the region encoding exons 3, 4 and 5 have been identified. The longest of the
parkin messenger RNAs, which is present in the brain, contains 2960 bases and
encodes a protein of 465 amino acids.



This protein has a slight homology with ubiquitin in its N-terminal
portion. Its C-terminal half contains two ring finger motifs, separated by an IBR (In
Between Ring) domain, which correspond to a cysteine-rich region and which are able
to bind metals, like the zinc finger domains (Morett, 1999). It has been shown by
immunocytochemistry that parkin is located in the cytoplasm and the Golgi apparatus
of neurons of the substantia nigra which contain melanin (Shimura et al., 1999). In
addition, this protein is present in certain Lewy bodies of Parkinsonians. The cellular
function of parkin has not yet been demonstrated, but it might play a transporter role
in synaptic vesicles, in the maturation or degradation of proteins, and in the control of
cellular growth, differentiation or development. In the autosomal recessive juvenile
forms, parkin is absent, which thus confirms that the loss of this function is
responsible for the disease.

The elucidation of the exact role of the parkin protein in the process of
degeneration of the dopaminergic neurons thus constitutes a major asset for the
understanding of and the therapeutic approach to Parkinson's disease, and more
generally diseases of the central nervous system.

The present invention lies in the identification of a partner of parkin,
which interacts with this protein under physiological conditions. This partner
represents a novel pharmacological target for manufacturing or investigating
compounds which are capable of modulating the activity of parkin, in particular its
activity on the degeneration of dopaminergic neurons and/or the development of
nervous pathologies. This protein, the antibodies, the corresponding nucleic acids, as
well as the specific probes or primers, can also be used for detecting or assaying the
proteins in biological samples, in particular nervous tissue samples. These proteins or
nucleic acids can also be used in therapeutic approaches for modulating the activity of
parkin, as can any compound according to the invention which is capable of modulating
the interaction between parkin and the polypeptides of the invention.

The present invention results more particularly from the demonstration
by the applicant of a novel human protein, referred to as PAP1 (Parkin
Associated Protein 1), or LY111, which interacts with parkin. The PAP1 protein
(sequence SEQ ID NO: 1 or 2) shows a certain homology with synaptotagmins and is
capable of interacting more particularly with the central region of parkin (represented
on the sequence SEQ ID NO: 3 or 4). The PAP1 protein has also been cloned,
sequenced and characterized from various tissues of human origin, specifically lung
(SEQ ID NO: 12, 13) and brain (SEQ ID NO: 42, 43) tissue, as well as short forms,
which correspond to splicing variants (SEQ ID NO: 14, 15, 44, 45).

The present invention also results from the identification and
characterization of specific regions of the PAP1 protein which are involved in the
modulation of the function of parkin. The demonstration of the existence of this
protein and of regions which are involved in its function makes it possible in
particular to prepare novel compounds and/or compositions which can be used as
pharmaceutical agents, and to develop industrial methods of screening such
compounds.

A first subject of the invention thus relates to compounds which are
capable of modulating, at least partially, the interaction between the PAP1 protein (or
homologs thereof) and parkin (in particular human parkin), or of interfering with the
interaction between these proteins.

Another subject of the invention lies in the PAP1 protein and
fragments, derivatives and homologs thereof.

Another aspect of the invention lies in a nucleic acid which encodes
the PAP1 protein or fragments, derivatives or homologs thereof, as well as any vector
which comprises such a nucleic acid and any recombinant cell which contains such a
nucleic acid or vector, and any non-human mammal comprising such a nucleic acid in
its cells.

The invention also relates to antibodies which are capable of binding
the PAP1 protein and fragments, derivatives and homologs thereof, in particular
polyclonal or monoclonal antibodies, more preferably antibodies which are capable of
binding the PAP1 protein and of inhibiting, at least partially, its interaction with
parkin.



Another aspect of the invention relates to nucleotide probes or primers,
which are specific to PAP1 and which can be used for detecting or amplifying the
PAP1 gene, or a region of this gene, in any biological sample.

The invention also relates to pharmaceutical compositions, methods
for detecting genetic abnormalities, methods for producing polypeptides as defined
above and methods for screening or for characterizing active compounds.

As indicated above, a first aspect of the invention lies in a compound
which is capable of interfering, at least partially, with the interaction between the
PAP1 protein (or homologs thereof) and parkin.

For the purposes of the present invention, the name PAP1 protein
refers to the protein per se, as well as to all homologous forms thereof. "Homologous
form" is intended to refer to any protein which is equivalent to the protein under
consideration, of varied cellular origin and in particular derived from cells of human
origin, or from other organisms, and which possesses an activity of the same type.
Such homologs also comprise natural variants of the PAP1 protein of sequence SEQ
ID NO: 2, in particular polymorphic or splicing variants. Such homologs can be
obtained by hybridization experiments between the coding nucleic acids (in
particular the nucleic acid of sequence SEQ ID NO: 1). For the purposes of the
invention, a sequence of this type only has to have a significant percentage of identity
to lead to a physiological behavior which is comparable to that of the PAP1 protein as
claimed. "Significant percentage of identity" is intended to refer to a percentage of at
least 60%, preferably 80%, more preferably 90% and even more preferably 95%. As
such, variants and/or homologs of the sequence SEQ ID NO: 2 are described in the
sequences SEQ ID NO: 13, 15, 43 and 45, and are identified from tissues of human
origin. The name PAP1 therefore also encompasses these polypeptides.
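
The identity thresholds stated above can be illustrated with a short Python sketch. The function name `identity_tier` and the tier labels are hypothetical, chosen only for illustration; the four cut-off values (60%, 80%, 90%, 95%) are the ones given in the text.

```python
from typing import Optional

# Cut-offs (in percent) come from the paragraph above; the labels are
# illustrative names and are not terms defined in the application.
IDENTITY_TIERS = [
    (95.0, "even more preferred"),
    (90.0, "more preferred"),
    (80.0, "preferred"),
    (60.0, "significant"),
]


def identity_tier(percent: float) -> Optional[str]:
    """Return the highest tier met by a percentage of identity, or None
    when the value is below the 60% minimum."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must lie between 0 and 100")
    for threshold, label in IDENTITY_TIERS:
        if percent >= threshold:
            return label
    return None
```

Under these assumptions, a sequence sharing 85% identity with SEQ ID NO: 2 would fall in the "preferred" tier, while one sharing 59% would not qualify as a homolog on this criterion.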

For the purposes of the present invention, the "percentage of identity" between two
sequences of nucleotides or amino acids can be determined by comparing two
optimally aligned sequences through a window of comparison.
The part of the nucleotide or polypeptide sequence in the window of
comparison can thus comprise additions or deletions (gaps, for example) as compared
to the reference sequence (which does not contain these additions or deletions) such
that an optimal alignment of the two sequences is obtained.

The percentage is calculated by determining the number of positions at which
an identical nucleic acid base or amino acid residue is observed for the two sequences
(nucleic acid or peptide) being compared, then dividing the number of positions at
which there is identity between the two amino acid residues or bases by the total
number of positions in the window of comparison, then multiplying the result by 100
so as to obtain the sequence identity percentage.
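
The calculation just described can be sketched in Python as follows. This is a minimal illustration, not the method of any particular software package: it assumes the two sequences have already been optimally aligned, that gaps are written as "-", and that the window of comparison is the whole aligned length. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical positions over the window of comparison.

    Both inputs must already be aligned and of equal length; gap
    characters mark additions or deletions and never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("window of comparison is empty")
    identical = sum(
        1
        for x, y in zip(aligned_a, aligned_b)
        if x == y and x != gap
    )
    # Divide by the total number of positions in the window, then scale to 100.
    return 100.0 * identical / len(aligned_a)
```

For example, the aligned pair "ACGT-A" and "ACCTTA" has four identical positions out of six, so the function returns roughly 66.7.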
Optimal alignment of the sequences for purposes of the comparison
can be performed on a computer using known algorithms contained in the Wisconsin
Genetics Software Package, produced by Genetics Computer Group (GCG), 575
Science Dr., Madison, Wisconsin.

For purposes of illustration, the sequence identity percentage may be
obtained with the BLAST software (BLAST versions 1.4.9 of March 1996, 2.0.4 of
February 1998 and 2.0.6 of September 1998) using only the default parameters
(Altschul et al., J. Mol. Biol. (1990) 215:403-410; Altschul et al., Nucleic Acids Res.
(1997) 25:3389-3402). BLAST searches for sequences which are similar/homologous to
a reference "query" sequence, using the Altschul et al. algorithm (above). The query
sequence and the databases used can be peptide or nucleic acid, with any combination
being possible.

The interference of a compound according to the invention can reveal
itself in various ways. Thus, the compound can slow, inhibit or stimulate, at least
partially, the interaction between the PAP1 protein, or a homologous form thereof, and
parkin. Preferably, they are compounds which are capable of modulating this
interaction in vitro, for example in a double-hybrid type system or in any acellular
system for detecting an interaction between two polypeptides. The compounds
according to the invention are preferably compounds which are capable of modulating,
at least partially, this interaction, preferably of increasing or inhibiting this interaction by
at least 20%, more preferably by at least 50%, as compared to a control in the absence
of the compound.
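
As an illustration of the 20% and 50% criteria, the following Python sketch compares an interaction signal measured with a compound to the control measured without it. The function names, the returned labels and the idea of a single numeric "signal" are assumptions made for the example; the text does not specify how the interaction is quantified.

```python
def percent_change(signal_with_compound: float, signal_control: float) -> float:
    """Signed change of the interaction signal relative to the control, in percent."""
    if signal_control == 0:
        raise ValueError("control signal must be non-zero")
    return 100.0 * (signal_with_compound - signal_control) / signal_control


def modulation_class(signal_with_compound: float, signal_control: float) -> str:
    """Classify a compound by the magnitude of the change it produces.

    Increases and inhibitions are treated alike, as in the text above.
    """
    change = abs(percent_change(signal_with_compound, signal_control))
    if change >= 50.0:
        return "at least 50%"
    if change >= 20.0:
        return "at least 20%"
    return "below 20%"
```

With a control signal of 100, a reading of 50 corresponds to a 50% inhibition and a reading of 125 to a 25% increase; a reading of 110 would fall below the 20% threshold.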

In a particular embodiment, they are compounds which are capable of
interfering with the interaction between the region of parkin which is represented on
the sequence SEQ ID NO: 4 and the region of the PAP1 protein which is represented
on the sequence SEQ ID NO: 2, 13, 15, 43 or 45.

According to a particular embodiment of the invention, the compounds 
are capable of binding at the domain of interaction between the PAP1 protein, or a 
homologous form thereof, and parkin. 

The compounds according to the present invention can be varied in
nature and in origin. In particular, they can be compounds of peptide, nucleic acid (i.e.
comprising a string of bases, in particular a DNA or an RNA molecule), lipid or
saccharide type, an antibody, etc. and, more generally, any organic or inorganic
molecule.

According to a first variation, the compounds of the invention are
peptide in nature. The term "peptide" refers to any molecule comprising a string of
amino acids, such as for example a peptide, a polypeptide, a protein or an antibody (or
antibody fragment or derivative), which, if necessary, is modified or combined with
other compounds or chemical groups. In this respect, the term "peptide" refers more
specifically to a molecule comprising a string of at most 50 amino acids, more
preferably of at most 40 amino acids. A polypeptide (or a protein) preferably
comprises from 50 to 500 amino acids, or more.

According to a first preferred embodiment, the compounds of the
invention are peptide compounds comprising all or part of the peptide sequence SEQ
ID NO: 2 or a derivative thereof, in particular all or part of the peptide sequence SEQ
ID NO: 13, 15, 43 or 45 or derivatives of these sequences, more
particularly of the PAP1 protein, which comprises the sequence SEQ ID NO: 2, 13,
15, 43 or 45.

For the purposes of the present invention, the term "derivative" refers
to any sequence which differs from the sequence under consideration because of a
degeneracy of the genetic code, which is obtained by one or more modifications of
genetic and/or chemical nature, as well as any peptide which is encoded by a sequence
which hybridizes with the nucleic acid sequence SEQ ID NO: 1, or a fragment of this
sequence, for example with the nucleic acid sequence SEQ ID NO: 12, 14, 42 or 44 or
a fragment of these sequences, and which is capable of interfering with the interaction
between the PAP1 protein, or a homolog thereof, and parkin. "Modification of genetic
and/or chemical nature" can mean any mutation, substitution, deletion, addition and/or
modification of one or more residues. The term "derivative" also comprises the
sequences which are homologous to the sequence under consideration, which are
derived from other cellular sources and in particular cells of human origin, or from
other organisms, and which possess an activity of the same type. Such homologous
sequences can be obtained by hybridization experiments. The hybridizations can be
carried out with nucleic acid libraries, using the native sequence or a fragment of this
sequence as probe, under varied conditions of hybridization (Maniatis et al., 1989).
Moreover, the term "fragment" or "part" refers to any portion of the molecule under
consideration, which comprises at least 5 consecutive residues, preferably at least 9
consecutive residues, even more preferably at least 15 consecutive residues. Typical
fragments can comprise at least 25 consecutive residues.
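
The definition of "fragment" given above (a portion of at least 5, 9, 15 or 25 consecutive residues) can be illustrated by a short Python sketch that enumerates every contiguous portion of a sequence meeting a minimum length. The function name `fragments` is hypothetical and the short input used in the example is for illustration only.

```python
from typing import Iterator


def fragments(sequence: str, min_len: int = 5) -> Iterator[str]:
    """Yield every contiguous portion of `sequence` that contains at
    least `min_len` consecutive residues, shortest portions first."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    n = len(sequence)
    for length in range(min_len, n + 1):
        for start in range(0, n - length + 1):
            yield sequence[start:start + length]
```

For a six-residue input such as "MGNFDN" and the default minimum of 5, the generator yields "MGNFD", "GNFDN" and the full sequence "MGNFDN". Raising `min_len` to 9, 15 or 25 corresponds to the progressively preferred fragment sizes in the text.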

Such derivatives or fragments can be generated with different aims,
such as in particular that of increasing their therapeutic effectiveness or of reducing
their side effects, or that of conferring novel pharmacokinetic and/or biological
properties thereon.

As a peptide which is derived from the PAP1 protein and from the
homologous forms, mention may be made in particular of any peptide which is
capable of interacting with parkin, but which bears an effector region which has been
made nonfunctional. Such peptides can be obtained
by deletion, mutation or disruption of this effector region on the PAP1 protein and
homologous forms. Such modifications can be carried out for example by in vitro
mutagenesis, by introducing additional elements or synthetic sequences, or by
deletions or substitutions of the original elements. When such a derivative as defined
above is prepared, its activity as partial inhibitor of the binding of the PAP1 protein,
and of the homologous forms on its binding site on parkin, can be demonstrated. Any
technique known to one skilled in the art can of course be used for this purpose.

They can also be fragments of the sequences indicated above. Such
fragments can be generated in various ways. In particular they can be synthesized
chemically, on the basis of the sequences given in the present application, using the
peptide synthesizers known to one skilled in the art. They can also be synthesized
genetically, by expression in a host cell of a nucleotide sequence which encodes the
desired peptide. In this case, the nucleotide sequence can be prepared chemically
using an oligonucleotide synthesizer, on the basis of the peptide sequence given in the
present application and of the genetic code. The nucleotide sequence can also be
prepared from sequences given in the present application, by enzymatic cleavage,
ligation, cloning, etc., according to the techniques known to one skilled in the art, or
by screening DNA libraries with probes which are developed from these sequences.

Moreover, the peptides of the invention, i.e., which are capable of
modulating, at least partially, the interaction between the PAP1 protein, and
homologous forms, and parkin, can also be peptides which have a sequence
corresponding to the site of interaction of the PAP1 protein and of the homologous
forms on parkin.

Other peptides according to the invention are peptides which are
capable of competing with the peptides defined above for the interaction with their
cellular target. Such peptides can be synthesized in particular on the basis of the sequence of
the peptide under consideration, and their capacity for competing with the peptides
defined above can be determined.

A specific subject of the present invention relates to the PAP1 protein.
It is more particularly the PAP1 protein comprising the sequence SEQ ID NO: 2 or a
fragment or derivative of this sequence, for example the PAP1 protein of sequence SEQ
ID NO: 13, 15, 43 or 45, or fragments of these sequences.

Another subject of the invention lies in polyclonal or monoclonal 
antibodies or antibody fragments or derivatives, which are directed against a 
polypeptide as defined above. Such antibodies can be generated by methods known to 
one skilled in the art. In particular, these antibodies can be prepared by immunizing an 
animal against a peptide compound of the invention (in particular a polypeptide or a 
peptide comprising all or part of the sequence SEQ ID NO: 2), sampling the blood and 
isolating the antibodies. These antibodies can also be generated by preparing 
hybridomas according to the techniques known to one skilled in the art. 

More preferably, the antibodies or antibody fragments of the invention 
have the capacity to modulate, at least partially, the interaction of the claimed peptides 
with parkin. 

Moreover, these antibodies can also be used for detecting and/or 
assaying the expression of PAP1 in biological samples and, consequently, for 
providing information on its activation state. 

The antibody fragments or derivatives are for example Fab or F(ab')2
fragments, single-chain antibodies (scFv), etc. They are in particular any fragment or
derivative which retains the antigenic specificity of the antibodies from which they are
derived.

The antibodies according to the invention are more preferably capable
of binding the PAP1 proteins which comprise the sequence SEQ ID NO: 2, 13, 43 or
45, in particular
the region of this protein which is involved in the interaction with parkin. These
antibodies (or fragments or derivatives) are more preferably capable of binding an
epitope which is present in the sequence between residues 1 and 344 of the sequence
SEQ ID NO: 2.

The invention also relates to compounds which are not peptide or not
exclusively peptide, which can be used as a pharmaceutical agent. It is in fact possible,
from the active protein motifs described in the present application, to prepare
molecules which are modulators of the activity of PAP1, are not exclusively peptide,
and are compatible with pharmaceutical use, in particular by duplicating the active
motifs of the peptides with a structure which is not a peptide, or which is not of
exclusively peptide nature.

A subject of the present invention is also any nucleic acid which
encodes a peptide compound according to the invention. It can be, in particular, a
nucleic acid comprising all or part of the sequence which is presented in SEQ ID NO:
1, 12, 14, 42 or 44 or a derivative thereof. For the purposes of the present invention,
"derived sequence" is intended to mean any sequence which hybridizes with the
sequence which is presented in SEQ ID NO: 1, or with a fragment of this sequence,
and which encodes a peptide compound according to the invention, as well as the
sequences which result from the latter by degeneracy of the genetic code. For
example, nucleic acids according to the invention comprise all or part of the nucleic
sequence SEQ ID NO: 12, 14, 42 or 44.

Moreover, the present invention relates to sequences which have a significant
percentage of identity with the sequence presented in SEQ ID NO: 1 or with a
fragment thereof and which encode a peptide compound with physiological behavior
which is comparable to that of the PAP1 protein. "Significant percentage of identity"
is intended to mean a percentage of at least 60%, preferably 80%, more preferably
90% and even more preferably 95%.
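By way of illustration only, the percentage of identity defined above can be computed for two sequences that have already been aligned. The sketch below is a minimal one under stated assumptions: the text does not specify the alignment method or how gaps are counted, so the convention used here (positions containing a gap are ignored) and the names percent_identity and identity_tier are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences.

    Assumption (not from the text): positions where either sequence
    carries a gap character '-' are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Thresholds taken from the definition above, highest first.
THRESHOLDS = [
    (95.0, "even more preferred"),
    (90.0, "more preferred"),
    (80.0, "preferred"),
    (60.0, "significant"),
]


def identity_tier(pct: float) -> str:
    """Map a percentage of identity onto the tiers named in the text."""
    for cutoff, label in THRESHOLDS:
        if pct >= cutoff:
            return label
    return "below threshold"
```

For example, two aligned four-base sequences differing at one position give 75% identity, which falls in the "significant" tier but not in the preferred ones.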

The various nucleotide sequences of the invention may or may not be of
artificial origin. They can be genomic, cDNA or RNA sequences,
hybrid sequences or synthetic or semi-synthetic sequences. These sequences
can be obtained either by screening DNA libraries (cDNA library, genomic DNA
library), by chemical synthesis, by mixed methods which include the chemical or
enzymatic modification of sequences which are obtained by screening of libraries, or
by searching for homology in nucleic acid or protein databases. The abovementioned
hybridization is preferably carried out under the conditions described by Sambrook et
al. (1989, pages 9.52-9.55).

It is advantageously carried out under highly stringent hybridization conditions. For
the purposes of the present invention, "highly stringent hybridization conditions" is
intended to mean the following conditions:

1- Competition of the membranes and pre-hybridization:

- Mix: 40 µl salmon sperm DNA (10 mg/ml)
+ 40 µl human placenta DNA (10 mg/ml)

- Denature for 5 min. at 96°C, then immerse the mixture in ice. 

- Remove the SSC 2X buffer and pour 4 ml formamide mix into the 
hybridization tube which contains the membranes. 

- Add the mixture of the two denatured DNAs. 

- Incubate at 42°C for 5 to 6 hours, with rotation. 

2- Competition of the labeled probe: 

- Add to the labeled and purified probe 10 to 50 µl Cot I DNA, according
to the quantity of non-specific hybridizations. 

- Denature 7 to 10 min. at 95°C 

- Incubate at 65°C for 2 to 5 hours. 

3 - Hybridization: 

- Remove the pre-hybridization mix 

- Mix 40 µl salmon sperm DNA + 40 µl human placenta DNA; denature
5 min. at 96°C, then immerse in ice.

- Add to the hybridization tube 4 ml formamide mix, the mixture of the
two DNAs and the labeled probe/denatured Cot I DNA.

- Incubate 15 to 20 hours at 42°C, with rotation. 

4- Washes:

- One wash at room temperature in SSC 2X, to rinse.

- 2 times 5 minutes at room temperature in SSC 2X and SDS 0.1%.

- 2 times 15 minutes in SSC 0.1X and SDS 0.1% at 65°C.

Wrap membranes in Saran and expose.

The hybridization conditions described above are suitable for hybridization
under highly stringent conditions of a nucleic acid molecule varying in length from 20
nucleotides to several hundred nucleotides.

The hybridization conditions described above could of course be adjusted to
take into account the length of the nucleic acid for which hybridization is desired or
the type of label chosen, according to techniques known to one skilled in the art.

For example, the suitable hybridization conditions can be adjusted according
to the teachings contained in the work of Hames and Higgins (1985) (Nucleic Acid
Hybridization: a Practical Approach, Hames and Higgins Ed., IRL Press, Oxford) or,
alternatively, in the work of F. Ausubel et al. (1999) (Current Protocols in Molecular
Biology, Greene Publishing Associates and Wiley Interscience, NY).

For the purposes of the invention, a particular nucleic acid encodes a
polypeptide comprising the sequence SEQ ID NO: 2 or a fragment or derivative of this
sequence, in particular the human PAP1 protein. It is advantageously a nucleic acid
comprising the sequence SEQ ID NO: 1, 12, 14, 42 or 44.

Such nucleic acids can be used for producing the peptide compounds
of the invention. The present application thus relates to a method for preparing such
peptide compounds, according to which a cell which contains a nucleic acid according
to the invention is cultured under conditions for expressing said
nucleic acid, and the peptide compound produced is recovered. In this case, the
portion which encodes said peptide compound is generally placed under the control of
signals which allow its expression in a host cell. The choice of these signals
(promoters, terminators, secretion leader sequence, etc.) can vary as a function of the
host cell used. Moreover, the nucleic acids of the invention can form part of a vector
which can replicate autonomously or can integrate. More particularly, autonomously
replicating vectors can be prepared using sequences which replicate autonomously in
the chosen host. As regards the integrating vectors, they can be prepared for example
using sequences which are homologous to certain regions of the genome of the host,
which allow, by homologous recombination, the integration of the vector. It can be a
vector of plasmid, episomal, chromosomal, viral, etc., type.

The host cells which can be used for producing the peptide compounds
of the invention via the recombinant pathway are both eukaryotic and prokaryotic
hosts. Among the eukaryotic hosts which are suitable, mention may be made of animal
cells, yeasts or fungi. In particular, as regards yeasts, mention may be made of the
yeasts of the genus Saccharomyces, Kluyveromyces, Pichia, Schwanniomyces, or
Hansenula. As regards animal cells, mention may be made of COS, CHO, C127,
PC12, etc., cells. Among the fungi, mention may be made more particularly of
Aspergillus spp. or Trichoderma spp. As prokaryotic hosts, use of the following
bacteria is preferred: E. coli, Bacillus or Streptomyces.

A subject of the present invention is also non-human mammals comprising in
their cells a nucleic acid or vector according to the invention.

Such mammals (rodents, canines, rabbits, etc.) can be used in particular to
study the properties of PAP1 and identify compounds with therapeutic aims. The
genome of such a transgenic animal can be modified by knock-in or knock-out
alteration or modification of one or more genes. This modification can be carried out
using conventional alterative or mutagenic agents, or via directed mutagenesis.
Modification of the genome can also be the result of the insertion of a gene(s) or the
replacement of a gene(s) in its (their) wild-type or mutated form. Genome
modifications are advantageously carried out on reproductive stem cells and
advantageously on pronuclei. Transgenesis can be
performed by microinjection of an expression cassette comprising the modified genes
in the two fertile pronuclei. Thus an animal according to the present invention can be
obtained by injection of an expression cassette comprising a nucleic acid. Preferably,
this nucleic acid is a DNA which can be a genomic DNA (gDNA) or a complementary
DNA (cDNA).

The construction of transgenic animals according to the invention can be carried out 
according to conventional techniques well known to one skilled in the art. A person 
skilled in the art can in particular refer to the production of transgenic animals, and 
specifically to the production of transgenic mice, as described in the following patents 
US 4,873,191; US 5,464,764 and US 5,789,215; the contents of these documents are 
incorporated herein by reference. 

In short, a polynucleotide construct which comprises a nucleic acid according 
to the invention is inserted into an ES-type stem cell line. Insertion of the 
polynucleotide construct is preferably performed by electroporation, as described by 
Thomas et al. (1987, Cell, Vol. 51; 503-512). 

The cells which have been subjected to the electroporation step are then
screened for the presence of the polynucleotide construct (for example by selection
using markers, or alternatively by PCR or by Southern-type DNA gel electrophoresis
analysis) so as to select the positive cells which integrated the exogenous
polynucleotide construct into their genome, if necessary after a homologous
recombination event. Such a technique is described, for example, by Mansour et al.
(Nature (1988) 336: 348-352).

The positively selected cells are then isolated, cloned and injected into 3.5
day-old mouse blastocysts, as described by Bradley (1987, Production and Analysis of
Chimaeric Mice. In: E.J. Robertson (Ed.), Teratocarcinomas and Embryonic Stem
Cells: A Practical Approach, IRL Press, Oxford, page 113). Blastocysts are then
introduced into a female animal host and development of the embryo is pursued to
full term.

Alternatively, positively selected ES-type cells are placed in contact with 2.5
day-old embryos at an 8-16 cell stage (morulae), as described by Wood et al. (1993,
Proc. Natl. Acad. Sci. USA, vol. 90: 4582-4585) or by Nagy et al. (1993, Proc. Natl.
Acad. Sci. USA, vol. 90: 8424-8428). The ES cells are internalized in order to
extensively colonize the blastocyst, including the cells which produce the germ line.

The descendants are then tested to determine those which have integrated the 
polynucleotide construct (the transgene). 

The nucleic acids according to the invention can also be used to prepare
genetic antisense sequences or antisense oligonucleotides which can be used as
pharmaceutical agents. Antisense sequences are oligonucleotides of short length, which are
complementary to the coding strand of a given gene, and consequently are capable of
specifically hybridizing with the mRNA transcript, which inhibits its translation into a
protein. A subject of the invention is thus antisense sequences which are capable of
inhibiting, at least partially, the interaction of the PAP1 proteins with parkin. Such
sequences can consist of all or part of the nucleic acid sequences defined above. They
are generally sequences, or fragments of sequences, which are complementary to
sequences encoding peptides which interact with parkin. Such oligonucleotides can be
obtained by fragmentation, etc., or by chemical synthesis.
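The complementarity on which antisense sequences rely can be illustrated by a short sketch which derives, from a fragment of coding strand, the oligonucleotide complementary to it. This is a minimal illustration only: the function name antisense_dna and the six-base example fragment are hypothetical and are not taken from the sequences of the invention.

```python
# Watson-Crick pairing table for DNA, upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_dna(coding_strand: str) -> str:
    """Return the reverse complement of a coding-strand fragment.

    Both the input and the result are written 5' to 3'. The result is
    complementary to the coding strand and therefore also to the mRNA
    transcribed from the gene, with which it can hybridize.
    """
    if not set(coding_strand) <= set("ACGTacgt"):
        raise ValueError("sequence may contain only A, C, G and T")
    return coding_strand.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    fragment = "ATGGCC"  # hypothetical coding-strand fragment
    print(antisense_dna(fragment))  # GGCCAT
```

Applying the function twice returns the original fragment, which is a convenient check that the pairing table is symmetric.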
The claimed sequences can be used in the context of gene therapies,
for transferring and expressing, in vivo, antisense sequences or peptides which are
capable of modulating the interaction of the PAP1 protein with parkin. In this respect,
the sequences can be incorporated in viral or nonviral vectors, which allows their
administration in vivo (Kahn et al., 1991). As viral vectors in accordance with the
invention, mention may be made most particularly of
adenovirus, retrovirus, adeno-associated virus (AAV) or herpes virus type vectors. A
subject of the present application is also recombination-defective viruses comprising a
nucleic acid which encodes a polypeptide according to the invention, in particular a
polypeptide or peptide comprising all or part of the sequence SEQ ID NO: 2 or of a
derivative of this sequence, for example all or part of the sequence SEQ ID NO: 12,
14, 42 or 44 or derivatives of these sequences.

The invention also enables the preparation of nucleotide probes, which
may or may not be synthetic, and which are capable of hybridizing with the nucleotide
sequences defined above or their complementary strand. Such probes can be used in
vitro as a diagnostic tool for detecting the expression or overexpression of PAP1, or
alternatively for revealing genetic abnormalities (incorrect splicing, polymorphism,
point mutations, etc.). These probes can also be used for detecting and isolating
homologous nucleic acid sequences which encode peptides as defined above, from
other cellular sources and preferably from cells of human origin. The probes of the
invention generally comprise at least 10 bases, and they can for example comprise up
to the whole of one of the abovementioned sequences or of their complementary
strand. Preferably, these probes are labeled prior to their use. For this, various
techniques known to one skilled in the art can be employed (radioactive, fluorescent,
enzymatic, chemical labeling, etc.).

The invention also relates to primers or primer pairs which make it
possible to amplify all or part of a nucleic acid encoding a PAP1, for example a
primer whose sequence is chosen from among SEQ ID NO: 16-41.

A subject of the invention is also any pharmaceutical composition
which comprises, as an active agent, at least one compound as defined above, in
particular a peptide compound.

A subject of the invention is in particular any pharmaceutical
composition which comprises, as an active agent, at least one antibody and/or one
antibody fragment as
defined above, as well as any pharmaceutical composition which comprises, as an
active agent, at least one nucleic acid or one vector as defined above.

A subject of the invention is also any pharmaceutical composition 
which comprises, as an active agent, a chemical molecule which is capable of 
increasing or of decreasing the interaction between the PAP1 protein and parkin. 

Moreover, a subject of the invention is also pharmaceutical 
compositions in which the peptides, antibodies, chemical molecules and nucleotide 
sequences defined above are combined mutually or with other active agents. 

The pharmaceutical compositions according to the invention can be 
used for modulating the activity of the parkin protein, and consequently for 
maintaining the survival of the dopaminergic neurons. More particularly, these 
pharmaceutical compositions are intended for modulating the interaction between the 
PAP1 protein and parkin. They are, more preferably, pharmaceutical compositions 
which are intended for treating diseases of the central nervous system, such as for 
example Parkinson's disease. 

A subject of the invention is also the use of the molecules described 
above for modulating the activity of parkin or for the typing of diseases of the central 
nervous system. In particular, the invention relates to the use of these molecules for 
modulating, at least partially, the activity of parkin. 

The invention also relates to a method for screening or characterizing
molecules which act on the function of parkin, which comprises selecting molecules which
are capable of binding the sequence SEQ ID NO: 2 or the sequence SEQ ID NO: 4, or
a fragment (or derivative) of these sequences. The method comprises, advantageously,
bringing the molecule(s) to be tested into contact, in vitro, with a polypeptide which
comprises the sequence SEQ ID NO: 2 or the sequence SEQ ID NO: 4, or a fragment
(or derivative) of these sequences, and selecting molecules which are capable of
binding the sequence SEQ ID NO: 2 (in particular the region between residues 1 and
344) or the sequence SEQ ID NO: 4. The molecules tested can be varied in nature
(peptide, nucleic acid, lipid, sugar, etc., or mixtures of such molecules, for example
combinatorial libraries, etc.). As indicated above, the molecules thus identified can be
used for modulating the activity of the parkin protein, and represent potential
therapeutic agents for treating neurodegenerative pathologies.

Other advantages of the present invention will appear upon reading the 
following examples and figure, which should be considered as illustrative and 
nonlimiting. 



LEGENDS TO THE FIGURES:

Figure 1: Representation of the vector pLex9-parkin (135-290).

Figure 2: Results of the first 5'-RACE experiment. 8 clones were obtained. The
initial sequence is indicated on the lower part of the figure.

Figure 3: Results of the second 5'-RACE experiment. Only two of the 8 clones
obtained in the first experiment were validated (clones A12 and D5). The initial
sequence is indicated on the lower part of the figure. The complete sequence of
DNAs and proteins is provided in Sequences 12-15.

Figure 4: Detailed view of the organization of clones C5 and D4 from the second
5'-RACE experiment. The resulting consensus sequence is indicated on the upper part of
the figure.

Figure 5: Structure of transcripts isolated from human brain.

Figure 6: LY111 (full length) nucleic acid and protein sequence from human brain.
Double underlined: cysteines retained from zinc finger domain. Bold: Domain C 2 1.
Italics: domain C 2 2.

Figure 7: LY111 (short version) nucleic acid and protein sequence from human brain.
Double underlined: cysteines retained from zinc finger domain. Bold: Domain C 2 1.
Italics: domain C 2 2.

Figure 8: Location of short (8b) or full length (8a) LY111 protein after expression in
Cos-7 cells.

Figure 9: LY111 (full length) nucleic acid and protein sequence from human lung.

Figure 10: LY111 (short version) nucleic acid and protein sequence from human
brain.

MATERIALS AND TECHNIQUES USED 
1) Yeast strains: 

Strain L40 of the genus S. cerevisiae (MATa, his3Δ200, trp1-901, leu2-3,112,
ade2, LYS2::(lexAop)4-HIS3, URA3::(lexAop)8-LacZ, GAL4, GAL80) was
used to verify the protein-protein interactions when one of the protein partners is fused
to the LexA protein. The LexA protein is capable of recognizing the LexA response
element, which controls the expression of the reporter genes LacZ and His3.

It was cultured on the following culture media:
Complete YPD medium: - Yeast extract (10 g/l) (Difco)
- Bactopeptone (20 g/l) (Difco)
- Glucose (20 g/l) (Merck)
This medium was solidified by addition of 20 g/l of agar (Difco).
Minimum YNB medium: - Yeast Nitrogen Base (without amino acids) (6.7 g/l) (Difco)
- Glucose (20 g/l) (Merck)

This medium can be solidified by addition of 20 g/l of agar (Difco). It can also be
supplemented with amino acids and/or with 3-amino-1,2,4-triazole by addition of
CSM media [CSM-Leu, -Trp, -His (620 mg/l), CSM-Trp (740 mg/l) or CSM-Leu,
-Trp (640 mg/l) (Bio101)] and/or of 2.5 mM 3-amino-1,2,4-triazole.

2) Bacterial strains: 

Strain TG1 of Escherichia coli, of genotype supE, hsdΔ5, thi, Δ(lac-proAB),
F' [traD36 proA+B+ lacIq lacZΔM15], was used for constructing
plasmids, as a means of amplifying and of isolating recombinant plasmids used. It was
cultured on the following medium:

Medium LB: - NaCl (5 g/l) (Prolabo)
- Bactotryptone (10 g/l) (Difco)
- Yeast extract (5 g/l) (Difco)

This medium is solidified by addition of 15 g/l of agar (Difco).

Ampicillin was used at 100 µg/ml; this antibiotic is used to select the
bacteria which have received the plasmids bearing the gene for resistance to this
antibiotic, as a marker.

Strain HB101 of Escherichia coli, of genotype supE44, ara14, galK2,
lacY1, Δ(gpt-proA)62, rpsL20 (Strr), xyl-5, mtl-1, recA13, Δ(mcrC-mrr), HsdS- (r- m-),
was used as means for amplifying and isolating plasmids which originate from the
human lymphocyte cDNA library.
It was cultured on
Medium M9: - Na2HPO4 (7 g/l) (Prolabo)
- KH2PO4 (3 g/l) (Prolabo)
- NH4Cl (1 g/l) (Prolabo)
- NaCl (0.5 g/l) (Prolabo)
- Glucose (20 g/l) (Sigma)
- MgSO4 (1 mM) (Prolabo)
- Thiamine (0.001%) (Sigma)
This medium is solidified by addition of 15 g/l of agar (Difco).
Leucine (50 mg/l) (Sigma) and proline (50 mg/l) (Sigma) should be added to the M9
medium to enable the growth of strain HB101.

During the selection of plasmids which originate from the lymphocyte
cDNA two-hybrid library, leucine was not added to the medium because the plasmids
bear a Leu2 selection marker.
3) Plasmids:

The 5-kb vector pLex9 (pBTM116) (Bartel et al., 1993), which is
homologous to pGBT10 and which contains a multiple cloning site located
downstream of the sequence which encodes the LexA bacterial repressor, and upstream
of a terminator, for forming a fusion protein.

pLex-HaRasVal12; plasmid pLex9, as described in application WO
98/21327, which contains the sequence encoding the HaRas protein mutated at
position Val12, which is known to interact with the mammalian Raf protein (Vojtek et
al., 1993). This plasmid was used to test the specificity of interaction of the PAP1
protein in strain L40.

pLex9-cAPP; plasmid pLex9 which contains the sequence encoding the
cytoplasmic domain of the APP protein, known to interact with the PTB2 domain of
FE65. This plasmid was used to test the specificity of interaction of the PAP1 protein
in strain L40.
4) Synthetic oligonucleotides:

TTAAGAATTC GGAAGTCCAG CAGGTAG (SEQ ID N°5)

ATTAGGATCC CTACACACAA GGCAGGGAG (SEQ ID N°6)

Oligonucleotides which made it possible to obtain the PCR fragment which
corresponds to the central region of parkin, bordered by the EcoRI and BamHI sites.

GCGTTTGGAA TCACTACAG (SEQ ID N°7)

GGTCTCGGTG TGGCATC (SEQ ID N°8)

CCGCTTGCTT GGAGGAAC (SEQ ID N°9)

CGTATTTCTC CGCCTTGG (SEQ ID N°10)

AATAGCTCGA GTCAGTGCAG GACAAGAG (SEQ ID N°11)

Oligonucleotides which were used to sequence the insert corresponding to the PAP1
gene.

The oligonucleotides are synthesized using an Applied Biosystems ABI
394-08 machine. They are removed from the synthesis matrix with ammonia and
precipitated twice with 10 volumes of n-butanol, and then taken up in water. The
quantification is carried out by measuring the optical density (1 OD260 unit corresponds to
30 µg/ml).
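The conversion used for this quantification can be written as a one-line calculation. The following is a minimal sketch: the factor of 30 µg/ml per OD260 unit is the one stated above, whereas the dilution factor parameter and the function name are assumptions added for illustration, since the text does not say whether the sample is diluted before reading.

```python
UG_PER_ML_PER_OD260 = 30.0  # conversion factor stated in the text


def oligo_concentration_ug_per_ml(od260: float, dilution_factor: float = 1.0) -> float:
    """Concentration of the undiluted oligonucleotide stock in µg/ml.

    od260 is the absorbance read at 260 nm; dilution_factor (hypothetical
    parameter) is the fold dilution applied before the reading.
    """
    if od260 < 0:
        raise ValueError("absorbance cannot be negative")
    if dilution_factor <= 0:
        raise ValueError("dilution factor must be positive")
    return od260 * dilution_factor * UG_PER_ML_PER_OD260


if __name__ == "__main__":
    # Hypothetical reading: OD260 of 0.5 on a 20-fold dilution.
    print(oligo_concentration_ug_per_ml(0.5, 20))  # 300.0
```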

5) Preparation of plasmid DNAs 

The preparations of plasmid DNA were carried out according to the
protocols recommended by Qiagen, the manufacturer of the DNA purification kits, in
small and large amounts:

- Qiaprep Spin Miniprep kit, reference: 27106

- Qiaprep Plasmid Maxiprep kit, reference: 12613.

6) Enzymatic amplification of DNA by PCR (Polymerase Chain Reaction): 

The PCR reactions are carried out in a final volume of 100 µl in the
presence of the DNA template, of dNTP (0.2 mM), of PCR buffer (10 mM Tris-HCl pH
8.5, 1 mM MgCl2, 5 mM KCl, 0.01% gelatin), of 10 to 20 pmol of each one of the
oligonucleotides and of 2.5 IU of Ampli Taq DNA polymerase (Perkin Elmer). The
mixture is covered with 2 drops of mineral oil to limit the evaporation of
the sample. The machine used is the "Crocodile II" by Appligene.

We used a template denaturation temperature of 94°C, a hybridization
temperature of 52°C and a temperature for elongation by the enzyme of 72°C.

7) Ligations: 

All the ligation reactions are carried out at 37°C for one hour in a final
volume of 20 µl, in the presence of 100 to 200 ng of vector, 0.1 to 0.5 µg of insert, 40
IU of T4 DNA ligase enzyme (Biolabs) and a ligation buffer (50 mM Tris-HCl pH 7.8;
10 mM MgCl2; 10 mM DTT; 1 mM ATP). The negative control consists of ligating
the vector in the absence of insert.

8) Transformation of bacteria:

The transformation of bacteria with a plasmid is carried out according
to the following protocol: 10 µl of the ligation volume are used to transform the TG1
bacteria, according to the method of Chung (Chung et al., 1989). After transformation,
the bacteria are placed on an LB medium + ampicillin and incubated for 16 h at 37°C.

9) Separation and extraction of DNAs:

The separation of DNAs is carried out as a function of their size, on
agarose gel by electrophoresis according to Maniatis (Maniatis et al., 1989): 1%
agarose gel (Gibco BRL) in a TBE buffer (90 mM Tris base; 90 mM borate; 2 mM
EDTA).

10) Fluorescent sequencing of plasmid DNAs:

The sequencing technique used is derived from the method of Sanger
(Sanger et al., 1977) and adapted for sequencing by fluorescence, which is developed
by Applied Biosystems. The protocol used is that described by the designers of the
system (Perkin Elmer, 1997).

11) Transformation of yeast:

The plasmids are introduced into the yeast using a conventional
technique for transforming yeast developed by Gietz (Gietz et al., 1992) and modified
in the following way:

In the specific case of the transformation of yeast with the lymphocyte
cDNA library, the yeast used contains the plasmid pLex9-parkin (135-290), which
encodes the central portion of parkin fused to the LexA protein. It is cultured in
200 ml of YNB minimum medium, supplemented with amino acids CSM-Trp, at 30°C
with shaking until a density of 10^7 cells/ml is attained. To carry out the transformation
of the yeasts, according to the above protocol, the cell suspension was separated into
ten 50-ml tubes, into which 5 µg of the library were added. Heat shock was carried out
for 20 minutes, and the cells were collected by centrifugation and resuspended in
100 ml of YPD medium for 1 h at 30°C, and
in 100 ml of YNB medium, supplemented with CSM-Leu, -Trp, for 3 h 30 at 30°C.
The efficiency of the transformation is determined by placing various dilutions of
transformed cells on solid YNB medium which is supplemented with CSM-Trp, -Leu.
After 3 days of culture at 30°C, the colonies obtained were counted, and the rate of
transformation per µg of lymphocyte library DNA was determined.
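The determination of the rate of transformation from plate counts can be sketched as follows. This is only an illustrative calculation: the text gives neither the dilutions nor the volumes plated, so every parameter name and every number in the example is hypothetical.

```python
def transformation_efficiency(
    colonies: int,
    dilution_factor: float,
    plated_volume_ml: float,
    culture_volume_ml: float,
    dna_ug: float,
) -> float:
    """Transformants obtained per µg of library DNA.

    colonies          : colonies counted on one selective plate
    dilution_factor   : fold dilution of the culture before plating
    plated_volume_ml  : volume of diluted cells spread on the plate
    culture_volume_ml : total volume of the transformed culture
    dna_ug            : total amount of library DNA used
    """
    if min(dilution_factor, plated_volume_ml, culture_volume_ml, dna_ug) <= 0:
        raise ValueError("volumes, dilution and DNA amount must be positive")
    if colonies < 0:
        raise ValueError("colony count cannot be negative")
    transformants_per_ml = colonies * dilution_factor / plated_volume_ml
    return transformants_per_ml * culture_volume_ml / dna_ug


if __name__ == "__main__":
    # Hypothetical plate: 130 colonies from 0.1 ml of a 100-fold dilution
    # of a 100 ml culture transformed with 50 µg of DNA.
    print(f"{transformation_efficiency(130, 100, 0.1, 100, 50):.2e}")
```

Counting several dilutions and averaging the resulting values would be a natural refinement; it is not shown here.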
12) Isolation of plasmids extracted from yeast:

5 ml of a yeast culture, which is incubated for 16 h at 30°C, are centrifuged,
and taken up in 200 µl of a lysis buffer (1 M Sorbitol, 0.1 M KH2PO4/K2HPO4 pH 7.4,
12.5 mg/ml zymolyase) and incubated for 1 h at 37°C. The lysate is then treated
according to the protocol recommended by Qiagen, the manufacturer of the DNA
purification kit, Qiaprep Spin Miniprep kit, ref 27106.

13) β-galactosidase activity assay:

A sheet of nitrocellulose is first placed on the Petri dish containing the
yeast clones, which are separated from each other. This sheet is then immersed in
liquid nitrogen for 30 seconds, in order to rupture the yeasts and thus to release the
β-galactosidase activity. After thawing, the sheet of nitrocellulose is placed, colonies
facing upwards, in another Petri dish containing a Whatman paper which has been
presoaked in 1.5 ml of PBS solution (60 mM Na2HPO4, 40 mM NaH2PO4, 10 mM
KCl, 1 mM MgSO4, pH 7) containing 15 µl of X-Gal
(5-bromo-4-chloro-3-indolyl-β-D-galactoside) at 40 mg/ml in N,N-dimethylformamide.
The dish is then placed in an
incubator at 37°C. The assay is termed positive when the colonies on the membrane
turn blue after 12 hours.

EXAMPLE 1: CONSTRUCTION OF A VECTOR WHICH ALLOWS THE
EXPRESSION OF A FUSION PROTEIN IN WHICH FUSION IS BETWEEN THE
CENTRAL PORTION OF PARKIN AND THE LEXA BACTERIAL REPRESSOR.

Screening a library using the two-hybrid system requires the central
region of parkin to be fused to a DNA binding protein, such as the LexA bacterial
repressor. The expression of this fusion protein is carried out using the vector pLex9
(cf. materials and methods), into which the sequence encoding the central region of
parkin, which is in the sequence presented in sequence SEQ ID NO: 3 or 4, is
introduced, in the same reading frame as the sequence corresponding to the LexA
protein.

The 468 bp-fragment of DNA corresponding to the 156 amino acids of
the central region of parkin, which begins at amino acid 135, was obtained by PCR
using the oligonucleotides (sequence SEQ ID NO: 5 and No. 6), which also made it
possible to introduce the EcoRI site at the 5' end and a stop codon and a BamHI site at
the 3' end. The PCR fragment was introduced between the EcoRI and BamHI sites of
the multiple cloning site of the plasmid pLex9, downstream of the sequence encoding
the protein LexA, in order to produce the vector pLex9-parkin (135-290) (Fig. 1).

The construct was verified by sequencing the DNA. This verification
made it possible to show that this fragment does not have mutations generated during
the PCR reaction, and that it was fused in the same open reading frame as that of the
fragment corresponding to LexA.
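The figures given above for the parkin fragment are mutually consistent, as a short calculation shows: residues 135 to 290 inclusive make 156 amino acids, and three nucleotides per codon give 468 bp. The sketch below merely checks this arithmetic; the function name is hypothetical, and the stop codon and restriction sites added by the primers are not counted.

```python
def fragment_size(first_residue: int, last_residue: int) -> tuple[int, int]:
    """Return (number of amino acids, coding length in bp) for a residue range.

    Both bounds are inclusive. Sequence added by the primers (restriction
    sites, stop codon) is deliberately left out of the count.
    """
    if first_residue < 1 or last_residue < first_residue:
        raise ValueError("invalid residue range")
    n_residues = last_residue - first_residue + 1
    return n_residues, 3 * n_residues


if __name__ == "__main__":
    print(fragment_size(135, 290))  # (156, 468)
```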

EXAMPLE 2: SCREENING A LYMPHOCYTE FUSION LIBRARY 

We used the two-hybrid method (Fields and Song, 1989).
Screening a fusion library makes it possible to identify clones
producing proteins which are fused to the transactivating domain of GAL4, and which
are able to interact with the protein of interest described in Example 1 (central region
of parkin). This interaction makes it possible to reconstitute a transactivator which
will then be capable of inducing the expression of the reporter genes His3 and LacZ in
strain L40.

To carry out this screening we chose a fusion library which is prepared
from cDNA originating from peripheral human lymphocytes, supplied by Richard
Benarous (Peytavi et al., 1999). Yeasts were transformed with the lymphocyte library
and positive clones were selected as described below.

During screening, it is necessary to maintain the probability that each
separate plasmid from the fusion library is present in at least one yeast at the same
time as the plasmid pLex9-parkin (135-290). To maintain this probability, it is
important to have a good efficiency of transformation of the yeast. For this, we chose a
protocol for transforming yeast which gives an efficiency of 2.6 x 10^5 transformed
cells per µg of DNA. In addition, as cotransforming yeast with two different plasmids
reduces this efficiency, we preferred to use a yeast which is pretransformed with the
plasmid pLex9-parkin (135-290). This strain L40 pLex9-parkin (135-290), of
phenotype His-, Lys-, Leu-, Ade-, was transformed with 50 µg of plasmid DNA from
the fusion library. This amount of DNA enabled us to obtain, after estimation, 1.3 x
10^7 transformed cells, which corresponds to a number which is slightly higher than the
number of separate plasmids which constitute the library. According to this result,
virtually all of the plasmids of the library can be considered to have been used to
transform the yeasts. The selection of the transformed cells, which are capable of
reconstituting a functional transactivator, was done on a YNB medium which was
supplemented with 2.5 mM 3-amino-1,2,4-triazole and 620 mg/l of CSM (Bio101),
and which contains no histidine, no leucine and no tryptophan.
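The estimate of the number of transformed cells follows directly from the two figures given above, namely the efficiency of the protocol and the amount of library DNA used. A minimal sketch of this multiplication is given below; the function name is hypothetical, and the calculation assumes, as the text implicitly does, that the efficiency stays constant over the whole amount of DNA.

```python
def expected_transformants(efficiency_per_ug: float, dna_ug: float) -> float:
    """Estimated number of transformed cells for a given amount of DNA.

    Assumes the efficiency (transformed cells per µg) is independent of
    the amount of DNA used, which is a simplification.
    """
    if efficiency_per_ug < 0 or dna_ug < 0:
        raise ValueError("efficiency and DNA amount must be non-negative")
    return efficiency_per_ug * dna_ug


if __name__ == "__main__":
    # Figures from the text: 2.6 x 10^5 cells per µg, 50 µg of library DNA.
    print(f"{expected_transformants(2.6e5, 50):.1e}")  # 1.3e+07
```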

At the end of this selection, many clones with a His+ phenotype were
obtained. A β-galactosidase activity assay was carried out on these transformants to
validate, on the basis of the expression of the other reporter gene, LacZ, this number of
obtained clones. 115 clones had the His+, β-Gal+ double phenotype, which can
correspond to a protein-protein interaction.



EXAMPLE 3: ISOLATION OF THE LIBRARY PLASMIDS IN THE 
SELECTED. 
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To identify the proteins which are able to interact with the central 
region of parkin, the fusion library plasmids contained in the yeasts which were 
selected during the double-hybrid screening were extracted. To be able to obtain a 
large amount thereof, this isolation calls for a prior transformation of E. coli with an 
extract of DNA from the positive yeast strains. As the library plasmid which is 
contained in this extract is a yeast/E. coli shuttle plasmid, it can easily replicate in the 
bacterium. The library plasmid was selected by complementing the auxotrophic 
HB101 bacterium for leucine, on leucine-lacking medium. 

The plasmid DNAs from the bacterial colonies which are obtained
after transformation with extracts of DNA from yeasts were analyzed by digestion
with restriction enzymes and separation of the DNA fragments on agarose gel. Among
the 115 clones analyzed, one clone containing a library plasmid, which showed a
different profile from the others, was obtained. This plasmid, termed pGAD-Ly111b,
was studied more precisely.

EXAMPLE 4: DETERMINATION OF THE SEQUENCE OF THE INSERT
CONTAINED IN THE PLASMID IDENTIFIED.

Sequencing of the insert contained in the plasmid identified was
carried out, firstly, using the oligonucleotide SEQ ID NO: 7, which is complementary
to the sequence GAL4TA, close to the EcoRI site of insertion of the lymphocyte
cDNA library; then, secondly, using the oligonucleotides SEQ ID NO: 8 to SEQ ID
NO: 11, which correspond to the sequence of the insert which is obtained during the
course of the sequencing. The sequence obtained is presented in the sequence SEQ ID
NO: 1. The protein thus identified was referred to as PAP1 (Parkin-Associated Protein
1).

Comparison of the sequence of this insert with the sequences which
are contained in the GenBank and EMBL (European Molecular Biology Lab)
databases showed a homology of 25% at the protein level with various members of the
synaptotagmin family. The synaptotagmins are part of a family of membrane proteins
which are encoded by at least eleven different genes, which are expressed in the brain
and other tissues. They contain a single transmembrane domain and two calcium-regulated
domains which are termed C2. It is in this domain that the homology
between the synaptotagmins and the PAP1 protein is found. No other significant
homology was observed.

EXAMPLE 5: ANALYSIS OF THE SPECIFICITY OF INTERACTION BETWEEN
THE CENTRAL REGION OF PARKIN AND THE PAP1 PROTEIN.

To determine the specificity of interaction between the fragment
corresponding to the PAP1 protein and the central region of parkin, a two-hybrid test
for specific interaction with other nonrelevant proteins was carried out. To carry out
this test, we transformed strain L40 with the control plasmids pLex9-cAPP or
pLex9-HaRasVal12, in place of the plasmid pLex9-parkin (135-290), which respectively
encode the cytoplasmic domain of the APP or the HaRasVal12 protein, which are
fused to the LexA DNA binding domain, and with the plasmid isolated during the
screening of the two-hybrid library. A β-Gal activity assay was carried out on the cells
which were transformed with the various plasmids, to determine a protein-protein
interaction. According to the result of the assay, only the yeasts which were
transformed with the plasmid which was isolated during the screening of the
two-hybrid library, and with the plasmid pLex9-parkin (135-290), had a β-Gal+ activity,
which thus shows an interaction between the central region of parkin and the PAP1
protein. This interaction thus turns out to be specific, since this fragment of PAP1 does
not seem to interact with the cAPP or HaRasVal12 proteins.
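
The reasoning of this specificity test can be summarized as a simple rule: the interaction is considered specific when the reporter is active with the parkin fusion and with none of the control fusions. The Python sketch below encodes the outcomes reported in this example; the dictionary layout and the function name are illustrative assumptions, not part of the assay itself.

```python
# β-Gal reporter outcome for the isolated library plasmid paired with each
# LexA fusion, as reported in the text (True = β-Gal+).
results = {
    "pLex9-parkin (135-290)": True,
    "pLex9-cAPP": False,
    "pLex9-HaRasVal12": False,
}

def is_specific(outcomes: dict[str, bool], bait: str) -> bool:
    """True when the reporter is active with the bait and with no control."""
    controls_positive = any(v for k, v in outcomes.items() if k != bait)
    return outcomes[bait] and not controls_positive

print(is_specific(results, "pLex9-parkin (135-290)"))  # True
```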

These results thus show the existence of a novel protein, referred to as
PAP1, which is capable of interacting specifically with parkin. This protein, which is
related to the synaptotagmins, shows no significant homology with
known proteins, and can be used in therapeutic or diagnostic applications, for
producing antibodies, probes or peptides, or for screening active molecules.

EXAMPLE 6: CLONING OF THE PAP1 GENE FROM A HUMAN LUNG DNA
LIBRARY

In order to identify the complete sequence of the human PAP1 gene and characterize
the existence of variant forms, two elongation approaches were carried out from the
sequence SEQ ID NO: 1. Two sequences were thus obtained, of 1644 bp and 1646 bp
respectively, comprising an elongation of 330 bp as compared to the sequence SEQ ID
NO: 1. Nonetheless, analysis of these sequences showed differences in the consensus
region, which were apparent after translation. Thus an ORF of 420 aa is obtained in
one case and an ORF of 230 aa with the other sequence. The protein sequence
obtained was compared with the known sequences and revealed a 24% homology over
the 293 amino acids that overlap with human synaptotagmin 1 (p65) (P21579). The
function of synaptotagmin 1 may be a regulatory role in the membrane interactions
which occur during synaptic vesicle traffic in the area of the synapse.
Synaptotagmin binds acidic phospholipids with a certain specificity. Moreover, a
calcium-dependent interaction between synaptotagmin and the receptors for activated
protein kinase C was reported. Synaptotagmin can also bind three other proteins,
which are neurexin, syntaxin and AP2. Given the premature and abrupt disappearance
of any homology between the sequences identified and the family of synaptotagmins,
the sequence identified may contain a deletion as compared to the natural sequence.
To verify this hypothesis and validate the sequences, an RT-PCR and sequencing
experiment was carried out using the 1644 bp sequence. The sequence obtained
comprises an ORF of 420 aa with a homology with the synaptotagmins of the same
order.

In an effort to obtain a larger sequence and verify whether the sequence obtained
could correspond to a form of splicing, a 5'-RACE elongation experiment was begun
at the 3' region of the validated sequence, using the L1 and L2 oligonucleotides on a
human lung cDNA preparation.

The results obtained appear in Figure 2 and show the identification of 8 clones
corresponding to 6 different 5' terminal ends. Three of these contain a stop codon
which interrupts the ORF (clones A12, F2, F12) and clone A3 contains no ORF. The
presence of various transcripts was confirmed by RT-PCR and nested RT-PCR (Table
1).

Table 1 



RT-PCR 



Primary 



Secondary 
PCR 
U3-L3 



Secondary 
PCR 
U1-L4 



Secondary 
PCR 
C-B 



U3-L3 



A-L4 



170 



153 



A-L3 
U1-L4 



U1-L3 



Ul-B 
U2-B 



Smear 
130 



Smear 



415 
515 



Expected size 



170 



130 



+ 
+ 



120 



The U3-L3 and C-B primer pairs are specific to the common fragment of the
sequence, the A and U1 oligonucleotides are specific to the initial sequence and to
clone C11, the L4 oligonucleotide is specific to the initial sequence and the U2 primer
is specific to clone A3. A second 5'-RACE was carried out with oligonucleotides L3
and L7 located in the common region of the different clones (Figure 2). The results
obtained appear in Figures 3 and 4. The presence of different transcripts was
confirmed by RT-PCR and nested RT-PCR (Table 2).

Table 3

Oligonucleotide   Sequence                        Position        SEQ ID NO

LY111_U4          [?]CAGTTCTGCCTGTTCATC           23 to 41        16
LY111_U5          [?]CAAAACACAGAGGAGGAG           319 to 338      17
LY111_U3          GAATTTGGTCAGTTTAGAGG            759 to 778      18
LY111_L7          [?]CTGGGATTTGGAGAGCTTTTTCAC     851 to 825      19
LY111_L6          TCTGTCTGTCCCACACACTGCC          914 to 892      20
LY111_L3          GACTGGCTCCGTCTCTCTG             92[?] to 910    21
LY111_C           AAGCAACAGAATCTCCCATCC           1029 to 1049    22
LY111_B           GCATTGTCAAAATTGCCCATC           1147 to 1127    23
LY111_E           AGGCGGAGAAATACGAAGAC            1543 to 1562    24
LY111_D           GCAGAGTGAGACAGCCCTTAAC          1767 to 1746    25
LY111_L2          CTTCCTCAGGACTGGCGACTTCAG        1811 to 1782    26
LY111_L1          CAAGCGGTCGTTCATTCCAAAGAG        1954 to 1913    27
LY111_F           AAGAGGAGATAACCCACCAGAG          2288 to 2269    28

Table 4

Oligonucleotide   Sequence                        Position        SEQ ID NO

LY111_A           TCGTAGAGCAGCAGGTCCAAG           14 to 34        46
LY111_U1          AGGGCTGCTGGCTATTTTTC            36 to 55        29
LY111_L4          TAAGAAATGGGTTGTGAAC             148 to 166      30
LY111_C           AAGCAACAGAATCTCCCATCC           1029 to 1049    31
LY111_B           GCATTGTCAAAATTGCCCATC           1147 to 1127    32
LY111_E           AGGCGGAGAAATACGAAGAC            1543 to 1562    33
LY111_D           GCAGAGTGAGACAGCCCTTAAC          1767 to 1746    34
LY111_L2          CTTCCTCAGGACTGGCGACTTCAG        1811 to 1782    35
LY111_L1          CAAGCGGTCGTTCATTCCAAAGAG        1934 to 1913    36
LY111_F           AAGAGGAGATAACCCACCAGAG          2288 to 2269    37
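
The positions listed for each oligonucleotide can be cross-checked against the length of its sequence. The Python sketch below does this for four entries of Table 4 whose sequence and coordinates are both legible; it assumes that positions are 1-based and inclusive and that reverse primers are listed with the higher coordinate first. The helper name is illustrative.

```python
def span_length(start: int, end: int) -> int:
    """Length of an oligonucleotide from 1-based inclusive coordinates.

    Reverse primers are listed with start > end, so the absolute
    difference is used.
    """
    return abs(end - start) + 1

# (sequence, first position, last position) as listed in Table 4.
oligos = {
    "U1": ("AGGGCTGCTGGCTATTTTTC", 36, 55),
    "L4": ("TAAGAAATGGGTTGTGAAC", 148, 166),
    "C": ("AAGCAACAGAATCTCCCATCC", 1029, 1049),
    "B": ("GCATTGTCAAAATTGCCCATC", 1147, 1127),
}

for name, (seq, start, end) in oligos.items():
    # Print the sequence length next to the length implied by the positions.
    print(name, len(seq), span_length(start, end))
```

For these four entries the two lengths coincide, which supports the reading of the coordinates given above.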

All of these results make it possible to validate the consensus sequence which
corresponds to the long isoform (Figure 9, SEQ ID NO: 12 and 13) and the short
isoform (Figure 10, SEQ ID NO: 14 and 15) of the PAP1 protein which was identified
from human lung. This protein is also referred to in the following examples as Ly111.
The long isoform is encoded by an ORF of 1833 bp, located at residues 237-2069 of
SEQ ID NO: 12 and comprises 610 amino acids. The polyadenylation signal is
located from nucleotide 2315. The short isoform is encoded by an ORF of 942 bp,
located at residues 429-1370 of SEQ ID NO: 14, and comprises 313 amino acids. The
polyadenylation signal is located from nucleotide 1616.
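
The ORF lengths and protein sizes quoted above follow directly from the coordinates. The Python sketch below recomputes them, assuming 1-based inclusive coordinates and an ORF that ends with one untranslated stop codon; the function name is illustrative.

```python
def orf_summary(start: int, end: int) -> tuple[int, int]:
    """Return (ORF length in bp, encoded amino acids).

    Coordinates are 1-based and inclusive. The ORF is assumed to include
    one terminal stop codon, which is not translated.
    """
    length_bp = end - start + 1
    if length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length_bp, length_bp // 3 - 1

print(orf_summary(237, 2069))  # (1833, 610)
print(orf_summary(429, 1370))  # (942, 313)
```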

Northern blot experiments were then performed on various human tissues with probes 
(amplimers C-D and E-F) and made it possible to reveal a 6 kb transcript in the muscle,
a transcript in the heart (3 kb), as well as a 6 kb transcript in the fetal liver. In 
addition, Example 7 describes the cloning of a transcript in the human fetal brain. 
Various homology studies were carried out in different protein databases and the 
results thereof are presented in Table 5, below. 

Table 5

Library      Homology

Genpept116   G5926736 (AB025258) granuphilin-a
             Identity: 31% (215/679), Homology (POS): 46% (322/679)

             G5926738 (AB025259) granuphilin-b
             Identity: 31% (150/479), Homology (POS): 47% (230/479)

             G1235722 (D70830) Doc2 beta (Homo sapiens)
             Identity: 25% (74/292), Homology (POS): 43% (127/292)

             G289718 (L15302) Synaptotagmin-I
             Identity: 26% (77/293), Homology (POS): 45% (133/293)

Swissprot    SP:SYT1_CAEEL Synaptotagmin I
             Identity: 26% (77/293), Homology (POS): 45% (133/293)

             SP:SYT2_MOUSE Synaptotagmin II
             Identity: 24% (72/293), Homology (POS): 44% (131/293)
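
The percentages in Table 5 are ratios of matching positions to aligned positions. The Python sketch below recomputes them for the Doc2 beta and synaptotagmin rows; it assumes that the percentages are truncated to whole numbers rather than rounded, which is what these rows suggest. The function name is illustrative.

```python
def percent(matches: int, aligned: int) -> int:
    """Percentage of matching positions, truncated to a whole number."""
    return matches * 100 // aligned

# (identical positions, positive positions, aligned length) from Table 5.
rows = {
    "Doc2 beta": (74, 127, 292),
    "Synaptotagmin I": (77, 133, 293),
    "Synaptotagmin II": (72, 131, 293),
}

for name, (identical, positives, aligned) in rows.items():
    print(name, percent(identical, aligned), percent(positives, aligned))
```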

EXAMPLE 7: CLONING OF TWO FULL-LENGTH PAP1 (LY111B)
TRANSCRIPTS FROM COMPLEMENTARY HUMAN FETAL BRAIN DNA

In order to confirm the presence of a full-length Ly111b transcript in the
human brain, a PCR was performed from complementary DNA taken from human
fetal brain (Marathon Ready cDNA, Clontech), using the oligonucleotides LyF1 (AAT
GGA AGG GCG TGA CGC, Figure 5, SEQ ID NO: 38) and HA71 (CCT CAC GCC
TGC TGC AAC CTG, SEQ ID NO: 39) as primers. A weakly represented DNA
fragment of approximately two kilobases was amplified. The product of this first
PCR served as a template for a nested PCR, carried out with oligonucleotides LyEcoF
(GCACGAATTC ATG GCC CAA GAA ATA GAT CTG, SEQ ID NO: 40) and
HA72 (CTG TCT TCG TAT TTC TCC GCC TTG, SEQ ID NO: 41). The amplified
products were digested with the restriction enzymes EcoRI (site integrated into the
oligonucleotide LyEcoF) and BstEII (Figure 5) and inserted into the expression vector
pcDNA3, then their sequence was determined. Analysis of the clone sequences
obtained revealed the presence of two potential full-length Ly111b transcripts in the
human fetal brain (Figure 5). The first of these transcripts (Ly111b-fullA) corresponds to
the mRNA which was identified in the human lung (Example 6) and encodes a
protein of 609 amino acids (pLy111b-fullA; Figures 5, 6, SEQ ID NO: 42-43). The
second (Ly111b-fullB) probably represents an alternative splicing product of a common
primary mRNA. In this transcript, which is otherwise identical to Ly111b-fullA, the sequence
between nucleotides 752 and 956 of the sequence validated in the human lung is
absent (SEQ ID NO: 42). Ly111b-fullB thus encodes a protein of 541 amino acids
(pLy111b-fullB) which is identical to pLy111b-fullA, in which, however, the domain
included between amino acids 172 and 240 (Figures 5, 7, SEQ ID NO: 44-45) is
missing. The two proteins pLy111b-fullA and pLy111b-fullB both contain the domain of interaction
with the fragment of parkin that comprises amino acids 135 to 290, which was
identified in the yeast (initial sequence Ly111b, Figure 5), and can therefore
theoretically maintain this interaction.
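
The sizes quoted for the two proteins are consistent with an in-frame deletion. The Python sketch below checks this, under the assumption that the two nucleotide boundaries quoted (752 and 956) delimit 956 - 752 = 204 absent nucleotides; the function name is illustrative.

```python
def codons_removed(nt_start: int, nt_end: int) -> int:
    """Codons lost when the nucleotides between two boundaries are absent.

    Assumption: the quoted boundaries delimit nt_end - nt_start nucleotides.
    """
    span = nt_end - nt_start
    if span % 3:
        raise ValueError("deletion would shift the reading frame")
    return span // 3

lost = codons_removed(752, 956)
print(lost)        # 68
print(609 - lost)  # 541
print(240 - 172)   # 68
```

Under this reading, the 68 codons removed account both for the difference between 609 and 541 amino acids and for the domain located between amino acids 172 and 240.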

The pLy111b-fullA and pLy111b-fullB proteins belong to the RIM/Rabphilin family

pLy111b-fullA and pLy111b-fullB show a homology with the proteins of the RIM/Rabphilin
family (Wang Y, Sugita S & Sudhof TC. The RIM/NIM Family of Neuronal C2
Domain Proteins. J Biol Chem (2000) 275, 20033-20044) and in particular with the
granuphilins (Wang J, Takeuchi T, Yokota H & Izumi T. Novel Rabphilin-3-like
Protein Associates with Insulin-containing Granules in Pancreatic Beta Cells. J Biol
Chem (1999) 274, 28542-28548). They are characterized by the presence of a zinc
finger domain in the N-terminal part and of two C2 domains in the C-terminal part
(Figures 6 and 7). The zinc finger domain of the proteins from the RIM/Rabphilin
family has been implicated in the interaction with the Rab proteins. These Rab proteins,
which bind GTP, are components which are essential to the machinery of membrane
traffic in eukaryotic cells. Moreover, it has been described that the C2 domains of
the proteins from the RIM/Rabphilin family can bind membranes by interacting with
phospholipids.

Expression of the pLy111b-fullA and pLy111b-fullB proteins in the cells of the COS-7 line:
co-localization with parkin

The coding sequence of the Ly111b-fullA and Ly111b-fullB transcripts was inserted into the eukaryotic
expression vector pcDNA3 in phase with the sequence which encodes a myc N-terminal
epitope (pcDNA3-mycLy111b-fullA and pcDNA3-mycLy111b-fullB). Cells from the COS-7 line which are
transfected using these vectors produce proteins with an apparent molecular weight of
approximately 67 kDa (pcDNA3-mycLy111b-fullA) and 60 kDa (pcDNA3-mycLy111b-fullB),
which corresponds to the expected molecular weight. These
proteins, which were detected via immunolabelling, using an antibody directed against
the N-terminal myc epitope, are distributed in the cytoplasm, the extensions and at
times the nucleus of the COS-7 cells in a non-homogeneous, punctate manner
(Figure 8a, b, column A). When these proteins are overexpressed with parkin and
revealed using the
Asp5 anti-parkin antibody in the cells of line COS-7 (Figure 8a, b, column B), a similar
distribution pattern and a co-localization of these proteins can be observed (Figure 8a,
b, column C).
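
The apparent molecular weights reported above can be compared with a rough estimate from the protein lengths. The Python sketch below uses an average mass of 110 Da per residue, which is a common rule of thumb and an assumption here, and it ignores the myc epitope; the labels and the function name are illustrative.

```python
AVERAGE_RESIDUE_MASS_DA = 110  # rule-of-thumb value, an assumption here

def approx_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from its length in amino acids."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000

# Protein lengths given in Example 7 (609 and 541 amino acids).
for name, length in (("pLy111b-fullA", 609), ("pLy111b-fullB", 541)):
    print(name, round(approx_mass_kda(length), 1))
```

The estimates, close to 67 kDa and 60 kDa, are in line with the apparent molecular weights observed.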

CLAIMS

1. Compound capable of modulating, at least partially, the
interaction between the PAP1 protein, or a homolog of this protein, and parkin.

2. Compound according to claim 1, characterized in that it slows,
inhibits or stimulates, at least partially, said interaction.

3. Compound according to either of claims 1 and 2, characterized
in that it is capable of binding the domain of interaction between the PAP1 protein, or a
homolog of this protein, and parkin.

4. Compound according to one of claims 1 to 3, characterized in [...]

5. Compound according to claim 4, characterized in that it is a
peptide compound comprising all or part of the peptide sequence SEQ ID NO: 2 or a
derivative thereof.

6. Compound according to claim 4, characterized in that it is a
peptide compound comprising a region of which the sequence corresponds to all or a
functional part of the site of interaction of the PAP1 protein with parkin.

7. Compound according to claim 4, characterized in that it is a
peptide compound which is derived from the PAP1 protein (and/or from the
homologous forms), and which bears an effector region which has been made
nonfunctional.

8. Polypeptide comprising the sequence SEQ ID NO: 2 or a
derivative or fragment of this sequence.

9. Polypeptide according to claim 8, comprising at least 5
consecutive residues of the sequence SEQ ID NO: 2, preferably at least 9, more
preferably at least 15.

10. Polypeptide according to claim 8, comprising all or part of
sequence SEQ ID NO: 13, 15, 43 or 45 or of a [...]
at least 5 consecutive residues, preferably at least 9, more preferably at least 15
consecutive residues of the sequence SEQ ID NO: 13, 15, 43 or 45.

11. Nucleic acid encoding a peptide compound according to one of
claims 4 to 10.

12. Nucleic acid according to claim 11, characterized in that it
comprises all or part of the sequence SEQ ID NO: 1, 12, 14, 42 or 44, or a sequence
which is derived from these sequences.

13. Nucleic acid encoding a polypeptide according to claim 8 or 11.

14. Nucleic acid, in particular a nucleotide probe, which is capable
of hybridizing with a nucleic acid according to one of claims 11 to 13, or with their
complementary strand.

15. Vector comprising a nucleic acid according to one of claims 11
to 14.

16. Recombination-defective virus comprising a nucleic acid
according to one of claims 11 to 14.

17. Nucleic acid chosen from among the nucleic acids of sequence
SEQ ID NO: 16-41, 46.

18. Antibody or antibody fragment or derivative, characterized in
that it is directed against a peptide compound according to one of claims 4 to 10.

19. Antibody according to claim 18, characterized in that it
recognizes a polypeptide according to claim 9 or 10.

20. Pharmaceutical composition comprising at least one compound
according to one of claims 1 to 10, or an antibody according to claim 18 or 19.

21. Nonpeptide compound or a compound which is not of
exclusively peptide nature, which is capable of modulating, at least partially, the
interaction of the PAP1 protein, or a homolog of this protein, with parkin.

22. Compound according to claim 21, characterized in that the
structure which is not a pep[...]

23. Pharmaceutical composition comprising at least one nucleic acid
according to one of claims 11 to 14 or one vector according to claim 15 or 16.

24. Pharmaceutical composition comprising a peptide compound
according to one of claims 4 to 10.

25. Composition according to claim 22, 23 or 24, intended for
treating neurodegenerative pathologies.

26. Method for screening or for characterizing active molecules,
which comprises a step of selecting molecules which are capable of binding the
sequence SEQ ID NO: 2 or the sequence SEQ ID NO: 4, or a fragment of these
sequences.

27. Method for screening or for characterizing active molecules,
which comprises a step of selecting molecules which are capable of binding a sequence
chosen from among SEQ ID NO: 13, 15, 43 and 45 or a fragment of these sequences.

28. Method for producing a peptide compound according to one of
claims 4 to 10, [...] one of claims 11 to 14 or a vector according to claim 15 or 16,
under conditions for expressing said nucleic acid, and the recovery of the peptide
compound produced.

29. Human PAP1 protein in isolated form.

30. Cell which contains a nucleic acid according to one of claims 11
to 14 or a vector according to claim 15 or 16.

31. A non-human mammal which comprises in its cells a nucleic
acid according to one of claims 11 to 14.



ABSTRACT 

The present invention relates to novel compounds and their uses, in 
particular their pharmaceutical or diagnostic uses or their use as pharmacological 
targets. More particularly, the present invention relates to a novel protein, referred 
to as PAP1, as well as to novel peptides and compounds which are capable of 
modulating, at least partially, the activity of parkin. 

SEQUENCE LISTING

<110> AVENTIS PHARMACEUTICALS, INC.

<120> COMPOSITIONS THAT CAN BE USED FOR REGULATING THE ACTIVITY OF PARKIN

<130> ST00005

<140>
<141>

<160> 46

<170> PatentIn ver. 2.1

<210> 1
<211> 1313
<212> DNA
<213> Homo sapiens

<220>
<221> CDS
<222> (1)..(1032)

<400> 1 ^ „ n nra rrc aQt acc ata ttc tct gga ggt 48 

96 



fs; s s ser its ss R ss ss ffi as s ss aj as 
isi a? - a; ss is if. ss as ss ss - ss 1 as s 

20 " 

a; SS SS BS SS RS SS «§ « « « - « ffi K K 

sr. ss ss ss ss s as ss a as as ss as s as ss 

50 55 

5! ss ss; bs ss w as as ss " 9 s i ss ss sss a $g 

65 70 

«a ss as sss ss 3 s ss ss ss ss as as ss 1 ss 

sj a? us as ss ss tSS ?s? sj ss - sss as m ?ss sss 

100 iUD 

- $ as «a as ss as as ss «a ?ss si as ss as 

115 1ZU 

s ?a s?s as sss m w ss ss ss ss sa sss sss a; as 

130 135 



144 

192 

240 

288 

336 

384 

432 
sa s ss a ss is is ss ss as r P ss ?s *s ,j

145 150 

ss ss s w ss a a ss rs i w « - - i - 

165 

£ ffi s as ss s w as i s ® a s 1 3 ® 
s s s a ss « s: a as R bs as as as s s 



195 



205 



eag cca tea ctt cat ggt caa ctt tgt ttg £. gtg eta gga gee aag 

Gin Pro ser Leu His Gly Gin Leu <_yb lc 22Q 
210 Z1:> 

c 

y 

240 



aat tta cct 
Asn Leu Pro 
225 



53 38 SS 35 S8 SE SS SS SSi 3 | 

230 

i a ffi a i k as an ss a s a w s ss «h 

a Sy' as ss & - as fj s; as - ss f 0 as ss 

260 

i% s s ss rj as a i as ss s a | ss s jb 

275 Z8U 

s s as k a ss | a ss ss ss a a w a; s 

290 Z9b 

j?s a sr, ss ss aj ss ?s si «s f a? s ss $ ss 

305 310 

ct a teg aag etc cag tgg cag aaa gtc ctt tee agx eee aat eta m 

Leu ser Lys Leu Gin Trp Gin Lys v* ^ 335 

aca gac atg act ctt gtc ctg cac tgacatgaag gectcaaggt tccaggttgc 
Thr Asp Met Thr Leu val Leu His 
340 

ageaggegtg aggcactgtg egtetgeaga ggggetaega accaggtgea gggteeeagc 
tggagaecec tttgaeettg agcagtctce atetgeggee etgtcccatg gettaaecge 
etattggtat ctgtgtatat ttaegttaaa eaeaattatg ttacctaage etctggtggg 
ttatetectc tttgagatgt agaaaatgge eagattttaa taaaegttgt taeeeatgaa 

aaaaaaaaaa a 



480 
528 
576 
624 
672 
720 
768 
816 
864 
912 
960 
1008 



1062 

1122 
1182 
1242 
1302 
1313 

<210> 2
<211> 344 
<212> PRT 

<213> Homo sapiens 



cTisn Leu Pro Ser ser Pro Ala Pro ser Thr He Phe ser Cly Gly 

Phe Arg His Gly ser Leu He ser He Asp ser Thr cys Thr c,u »et 

Gly Asn Phe TP Asn Ala Asn val Thr Gly Glu He Glu Phe AT He 

His T yr cys Phe Lys Thr His ser Leu Glu He cys He Lys Ala Cys 

50 " 
Lys as„ Leu Ala Tyr cly clu clu Lys Lys Ly,s Lys cys Asn Pro Tyr 

vll Lys Thr Tyr Leu Leu Pro Asp Arg ser ser cm cly Lys Arg Lys 

T hr Cly va! Gin Ar g Ash Thr val Asp Pro Thr Phe cln clu Thr Leu 

Lys T yr Cln vll Ala Pro Ala Gin Leu val Thr Arg cln Leu cln val 

r-u, Thr Leu Ala Arg Arg val Phe Leu Gly Glu 
ser val Trp His Leu Gly Thr Leu Aia Ary y ^ 

130 

(1 , Thr TrD asp phe Glu Asp ser Thr Thr Gin 
val He He Ser Leu Ala Thr Trp Asp ^ k 160 

145 

ser Phe Arg Trp His Pro Leu Arg Ala Lvs Ala clu Lys Tyr clu Asp 

16 b 

ser val Pro Cln ser Asn cly clu Leu Thr val Arg Ala Lvs Leu val 

180 

Leu Pro ser Arg Pro Arg Lys Leu cln clu Ala cln clu cly Thr Asp 

195 zuu 

u -c riv rln Leu cys Leu val val Leu Gly Ala Lys 
Gin Pro ser Leu His Gly Gin Leu cyb nQ 

210 £ ^ -. 

Asn Leu Pro val Arg Pro Asp cly Thr Leu Asn ser Phe val Lys cly 

Z Leu Thr Leu Pro Asp cln cln Lys Leu Arg Leu Lys ser Pro val 

Leu Arg Lys cln Ala cys Pro cln Tro Lys His ser Phe val Phe ser 

c ly val Thr To Ala cln Leu Arg cln ser ser Leu clu Leu Thr val 

Trp asp Cln Ala Leu Phe cly „et Asn Asp Arg Leu Leu cly cly Thr 

290 ^ 

Arg Leu Gly Ser Lys Gly Asp Thr Ala Val Gly Gly Asp Ala Cys Ser
305 310 

Leu Ser Lys Leu Gin Trp Gin Lys val Leu Ser Ser Pro Asn Leu Trp 
325 330 

Thr Asp Met Thr Leu Val Leu His 
340 



<210> 3 
<211> 471 
<212> DNA 

<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(471)



ipf r x ss i% i% ss s; $ ss sss $ ?a ^ $ 

S! 35 $ S Arg 83 3S "r§ S S5 iS fj 38 $ 

20 25 30 

anr acc tac aaa caq gca acq etc acc ttg acc cag gat cca tct tgc 
sir Thr cys A?g Gin Ala Thr Leu Thr Leu Thr Gin G?y Pro Ser Cys 

35 40 4:> 

tgg gat gat gtt tta att cca aac egg atg agt ggt gaa tgc caa tec 
Trp Asp Asp val Leu He Pro Asn Arg Met ser cTy Glu Cys Gin Ser 
50 55 bU 

cca cac tac cct ggg act agt gca gaa ttt ttc ttt aaa tgt gga gca 
Pro His cys Pro G?y Thr Ser Ala Glu Phe Phe Phe Lys cys Gly Ala 
65 70 75 

cac ccc acc tct gac aag gaa aca tea gta get ttg cac ctg ate gca 
His Pro Thr sir Isp LyI Glu Thr Ser val Ala Leu His Leu lie Ala 



96 



144 



192 



240 



288 



336 



384 



ara aat aat caa aac ate act tgc att acg tgc aca gac gtc agg age 
Th? lln llr Arl aIh He Thr C?s lie Thr Cys Thr Asp val Arg Ser 
100 105 

ccc ate eta att ttc cag tgc aac tec cgc cac gtg att tgc tta gac 
Pro val Leu val Phe Gin cys Asn ser Arg His val lie Cys Leu Asp 
115 120 

tat ttc cac tta tac tgt gtg aca aga etc aat gat egg cag ttt gtt 432 
cys lie His Leu Tyr cys vaT Thr Arg Leu Asn Asp Arg Gin Phe val 
130 135 I 40 

cac gac cct caa ctt ggc tac tec ctg cct tgt gtg tag 471 
His Asp Pro Gin Leu Gly Tyr Ser Leu Pro cys val 
145 150 155 



<210> 4 
<211> 156
<212> PRT 

<213> Homo sapiens 



G^Vr Pro Ala Gly Arg ser lie Tyr Asn ser Phe Tyr val Tyr Cys 
Ly \ G ly Pro cys Gin Arg va! Gin Pro Gly Lys Leu Arg val Gin cys 
ser Thr cys Arg Gin Ala Thr Leu Thr Leu Thr Gin Pro ser cys 
Trp asp Asp val Leu He Pro Asn Arg Met ser Gly Glu cys G n ser 
Pro h52 cys Pro G ly Thr Ser Ala g!u Phe Phe Phe Lys cys Gly A a 
„?| Pro Thr ser Asp Ly? Glu Thr ser val Ala Leu His Leu lie Ala 
Thr Asn ser Arg Asn He Thr Cys lie Thr cys Thr Asp val Arg ser 

Pro val Leu val Phe Gin cys Asn & Arg His val xle Cys Leu Asp 

cys Phe S Leu Tyr cys val S°r Arg Leu Asn Asg Arg Gin Phe val 

His Asp p™ Gin Leu Gly Tyr ser Leu Pro cys val 

145 150 



<210> 5
<211> 27
<212> DNA
<213> Artificial sequence

<220>
<223> Description of the artificial sequence: Oligonucleotide

<400> 5
ttaagaattc ggaagtccag caggtag                                     27

<210> 6
<211> 29
<212> DNA
<213> Artificial sequence

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 6
attaggatcc ctacacacaa ggcagggag                                   29

<210> 7
<211> 19
<212> DNA
<213> Artificial sequence

<220>
<223> Description of the artificial sequence: Oligonucleotide

<400> 7
gcgtttggaa tcactacag

<210> 8
<211> 17
<212> DNA
<213> Artificial sequence

<220>
<223> Description of the artificial sequence: Oligonucleotide

<400> 8
ggtctcggtg tggcatc                                                17

<210> 9
<211> 18
<212> DNA
<213> Artificial sequence

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 9
ccgcttgctt ggaggaac                                               18

<210> 10
<211> 18
<212> DNA
<213> Artificial sequence

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 10
cgtatttctc cgccttgg                                               18

<210> 11
<211> 28
<212> DNA
<213> Artificial sequence

<220>
<223> Description of the artificial sequence: Oligonucleotide

<400> 11
aatagctcga gtcagtgcag gacaagag                                    28

<210> 12 
<211> 2347 
<212> DNA 

<213> Homo sapiens 

< 400> 12 tnrran1 . trt ncctattcat ctggaacctg gatctaagga 60 

qgccttgggg cactgaggga tgccagttct j£"9""£ cccaqccagg tattgaacgg 120 
SggaagaggE gttgcccctg "ggcatagt caggtaccag cccagc gg aggg « gtgag ^0 
gctgagcttt tcatgatggt tcctgctgac "ggaaa cgaaaacgga gaagaaatgg 240 
cqcttggtcc atgcagtgaa gctcttccaa ctMjyy;- * qccatt ctccaggtcc 300 

;«S!S8S SS8SSB ««ng ™« 4so 



si mi ins it liii mi i 

?g?aatctta tcagaagctg agcaaaattt "gtggttcc ttta ?*?K Ifo 

gcgagagcca gtgcagccgc ^gtcctggca ggxx yy cca cgtgaaa aagctctcca 840 

SS3SS «S ESSE 0 

«8 si SSSES ^ | 

SSESSS3 SSSSSS s «« 

Icaatgitaa tgtcactgga gaaatagaat "gccati tggagaagaa aagaagaaaa 1260 
ctttagaaat atgcatcaag 9"tgtaaga accttgcc™ TO" ggaaagcgca 1320 
agtgclatcc gtatgtgaag ?«tacctgt tE£«gacag ttg aa gtatcagg 1380 

feS S I « ? f- gags m 
fes SSK ? |s S3S888 -"S 

tttgtttggt agtgctagga Qccaagaatt tacctgty y « ctg aagtcgccag 1800 

dsss sss£ - Esss 3SSSS 
ssfe ssass I II Ss fes sssss 

aaqatgcatg ctcacaatcg aagctccagt 99cagad*y aaggttcca ggttgcagca 2100 

issss s» 1 I sabs sssssr 

gacccctttg accttgagca Jtctccatct gwecctax »» tct ggt gggttat 2280 
effflg EUSSS Staataaa cgttgttacc catgaaaaaa 2340 



aaaaaaa 



<210> 13 
<211> 610 
<212> PRT 

<213> Homo sapiens 



^la'cln clu fl. Asp Leu ser Ala Leu Lys clu Leu clu Arg clu 

Ala He Leu cm val Leu Tyr Arg Asp Gin Ala val Gin Asn Tnr 61„ 

20 " 
6 ,u Glu Ar ? Thr Arg Lys Leu Lys Thr His Leu Gin Hi. Leu Arg Trp 

Lys s1y 1 Lys Asn Thr Asp Trp Glu His Lys Glu Lys cys cys Ala 

50 bb 
Arg cys Gin Gin val Leu Gly Phe Leu Leu His Arg Gly Ala val Cvs 

Arg Gly cys Ser His Arg val cys Ala Gin cys Arg v.! Phe Leu Arg 

sly Thr His Ala Trp Lys cys Thr val cys Phe Glu Asp jrj Asn val 

100 x 
Lys lie Lys Thr cly clu Trp Phe Tyr clu Clu Arg Ala Lys Lys Phe 
Pro Thr Gly Gly Lys His Glu Thr Val Gly Gly Gln Leu Leu Gln Ser

130 135 
Tyr Gin Lys Leu Ser Lys He ser val val Pro Pro Thr Pro Pro Pg 

145 150 

val ser Glu ser Gin cys ser Arg ser Pro Gly Arg Leu Gin Glu Phe 



165 



Gly Gin Phe Arg Gly Phe Asn Lys ser val Glu Asn Leu Phe Leu Ser 



180 



L eu Ala Thr His val Lys Lys Leu ser Lys ser Gin Asn Asp Met Thr 

195 200 
ser Glu Lys His Leu Leu Ala Thr Gly Pro Arg Gin cys val Gly Gin 

210 215 
Thr Glu Arg Arg ser Gin ser Asp Thr Ala val Asn Val Thr Thr Arg 
225 230 

Lys val ser Ala Pro Asp He Leu Lys Pro Leu Asn Gin Glu Asp Pro 



245 



Ly5 cys ser Thr Asn Pro He Leu Lys Gin Gin Asn Leu Pro Ser Ser 

7 260 2bb 

Pro Ala Pro ser Thr He Phe Ser Gly Gly Phe Arg His Gly ser Leu 

275 280 
He ser He Asp ser Thr Cys Thr Glu Met Gly Asn Phe Asp Asn Ala 

290 295 
A sn val Thr Gly Glu lie Glu Phe Ala He His Tyr cys Phe Lys Thr 
305 310 

His ser Leu Glu lie Cys He Lys Ala Cys Lys Asn Leu Ala Tyr Gly 

Glu Glu Lys Lys Lys Lys cys Asn Pro Tyr val Lys Thr Tyr Leu Leu 

340 iH0 

c ^ ri„ riv iv; Ara Lvs Thr Gly val Gin Arg Asn 
pro Asp Arg Ser ser Gin Gly Lys Arg Lys .. y ^ 



355 



Thr val Asp Pro Thr Phe Gin Glu Thr Leu Lys Tyr Gin val Ala Pro 

Ala Gin Leu val Thr Arg Gin Leu Gin val ser val Trp His Leu Gly 
385 390 

Thr Leu Ala Arg Arg Val Phe Leu Gly Glu val He He Pro Leu Ala 

Thr Trp Asp Phe Glu Asp ser Thr Thr Gin Ser Phe Arg Trp His Pro 

420 ^" 
Leu Arg Ala Lys Ala Glu Lys Tyr Glu Asp ser val Pro Gin ser Asn 



435 "40 

Leu 

450 «=> Page * 



Gly G] u Leu Thr val Arg Ala Lys Leu val Leu PrO Q ser Arg Pro Arg 
Lys Leu Gln Glu Ala Gln Glu Gly Thr Asp Gln Pro Ser Leu His Gly



465 



Gin Leu cys Leu val val Leu Gly Ala Lys Asn Leu Pro val Arg Pro 

485 

Asp 6 ly Thr lju Asn ser Phe val Lys Gly cys Leu Thr Leu Pro Asp 
Gin Gin Lys Leu Arg Leu Lys ser Pro val Leu Arg Lys Gin Ala cys 



515 



Pro Gin Trp Lys His ser Phe val Phe Ser Gly val Thr Pro Ala Gin 

530 535 
Leu Arg Gin ser Ser Leu Glu Leu Thr val Trp Asp Gin Ala Leu Phe 



545 



550 



Gly Met Asn Asp Arg Leu Leu Gly Gly Thr Arg Leu Gly Ser Lys Gly 

Asp Thr Ala val Gly Gly Asp Ala cys Ser Gin ser Lys Leu Gin Trp 
580 585 

Gin Lys val Leu ser ser Pro Asn Leu Trp Thr Asp Met Thr Leu val 

595 600 UUJ 

Leu His 
610 



<210> 14 
<211> 1648 
<212> DNA 

<213> Homo sapiens 



<400> 14 

gaaatcatgc ccctcgtaga gcagcaggtc caagcagggc tgctggctat ttttccaaaa 60
agtgaggcag tttaaaaaaa aggcggagaa ctagaattat agaataatgg cacattttgt 120
gtatttgtaa aactaacggc ttgcatggtt cacaacccat ttcttatgcc tgtgttttcc 180
ttggcagcaa aatttctgtg gttcctccta ctccacctcc tgtcagcgag agccagtgca 240
gccgcagtcc tggcaggaag gtcagtgcac cagatattct gaaacctctc aatcaagagg 300
atcccaaatg ctctactaac cctattttga agcaacagaa tctcccatcc agtccggcac 360
ccagtaccat attctctgga ggttttagac acggaagttt aattagcatt gacagcacct 420
gtacagagat gggcaatttt gacaatgcta atgtcactgg agaaatagaa tttgccattc 480
attattgctt caaaacccat tctttagaaa tatgcatcaa ggcctgtaag aaccttgcct 540
atggagaaga aaagaagaaa aagtgcaatc cgtatgtgaa gacctacctg ttgcccgaca 600
gatcctccca gggaaagcgc aagactggag tccaaaggaa caccgtggac ccgacctttc 660
aggagacctt gaagtatcag gtggcccctg cccagctggt gacccggcag ctgcaggtct 720
cggtgtggca tctgggcacg ctggcccgga gagtgtttct tggagaagtg atcattcctc 780
tggccacgtg ggactttgaa gacagcacaa cacagtcctt ccgctggcat ccgctccggg 840
ccaaggcgga gaaatacgaa gacagcgttc ctcagagtaa tggagagctc acagtccggg 900
ctaagctggt tctcccttca cggcccagaa aactccaaga ggctcaagaa gggacagatc 960
agccatcact tcatggtcaa ctttgtttgg tagtgctagg agccaagaat ttacctgtgc 1020
ggccagatgg caccttgaac tcatttgtta agggctgtct cactctgcca gaccaacaaa 1080
aactgagact gaagtcgcca gtcctgagga agcaggcttg cccccagtgg aaacactcat 1140
ttgtcttcag tggcgtaacc ccagctcagc tgaggcagtc gagcttggag ttaactgtct 1200
gggatcaggc cctctttgga atgaacgacc gcttgcttgg aggaaccaga cttggttcaa 1260
agggagacac agctgttggc ggggatgcat gctcacaatc gaagctccag tggcagaaag 1320
tcctttccag ccccaatcta tggacagaca tgactcttgt cctgcactga catgaaggcc 1380
tcaaggttcc aggttgcagc aggcgtgagg cactgtgcgt ctgcagaggg gctacgaacc 1440
aggtgcaggg tcccagctgg agaccccttt gaccttgagc agtctccatc tgcggccctg 1500



st0005seq «.«.,*„**. a mfin 

sags ssssss sass* ssass ssss sskss s» 

acgttgttac ccatgaaaaa aaaaaaaa 

<210> 15 
<211> 313 
<212> PRT 

<213> Homo sapiens 



<400> 15

Met Gly Asn Phe Asp Asn Ala Asn Val Thr Gly Glu Ile Glu Phe Ala
1               5                   10                  15

Ile His Tyr Cys Phe Lys Thr His Ser Leu Glu Ile Cys Ile Lys Ala
            20                  25                  30

Cys Lys Asn Leu Ala Tyr Gly Glu Glu Lys Lys Lys Lys Cys Asn Pro
        35                  40                  45

Tyr Val Lys Thr Tyr Leu Leu Pro Asp Arg Ser Ser Gln Gly Lys Arg
    50                  55                  60

Lys Thr Gly Val Gln Arg Asn Thr Val Asp Pro Thr Phe Gln Glu Thr
65                  70                  75                  80

Leu Lys Tyr Gln Val Ala Pro Ala Gln Leu Val Thr Arg Gln Leu Gln
                85                  90                  95

Val Ser Val Trp His Leu Gly Thr Leu Ala Arg Arg Val Phe Leu Gly
            100                 105                 110

Glu Val Ile Ile Pro Leu Ala Thr Trp Asp Phe Glu Asp Ser Thr Thr
        115                 120                 125

Gln Ser Phe Arg Trp His Pro Leu Arg Ala Lys Ala Glu Lys Tyr Glu
    130                 135                 140

Asp Ser Val Pro Gln Ser Asn Gly Glu Leu Thr Val Arg Ala Lys Leu
145                 150                 155                 160

Val Leu Pro Ser Arg Pro Arg Lys Leu Gln Glu Ala Gln Glu Gly Thr
                165                 170                 175

Asp Gln Pro Ser Leu His Gly Gln Leu Cys Leu Val Val Leu Gly Ala
            180                 185                 190

Lys Asn Leu Pro Val Arg Pro Asp Gly Thr Leu Asn Ser Phe Val Lys
        195                 200                 205

Gly Cys Leu Thr Leu Pro Asp Gln Gln Lys Leu Arg Leu Lys Ser Pro
    210                 215                 220

Val Leu Arg Lys Gln Ala Cys Pro Gln Trp Lys His Ser Phe Val Phe
225                 230                 235                 240

Ser Gly Val Thr Pro Ala Gln Leu Arg Gln Ser Ser Leu Glu Leu Thr
                245                 250                 255

Val Trp Asp Gln Ala Leu Phe Gly Met Asn Asp Arg Leu Leu Gly Gly
            260                 265                 270

Thr Arg Leu Gly Ser Lys Gly Asp Thr Ala Val Gly Gly Asp Ala Cys
        275                 280                 285

Ser Gln Ser Lys Leu Gln Trp Gln Lys Val Leu Ser Ser Pro Asn Leu
    290                 295                 300

Trp Thr Asp Met Thr Leu Val Leu His
305                 310

<210> 16 
<211> 19 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 16
ccagttctgc ctgttcatc 19

<210> 17 
<211> 20 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 17
ttcaaaacac agaggaggag 20

<210> 18 
<211> 20 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 18
gaatttggtc agtttagagg 20

<210> 19 
<211> 26 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 19
ttctgggatt tggagagctt tttcac 26

<210> 20 
<211> 22 
<212> DNA 

<213> Artificial sequence 



<220> 
<223> Description of the artificial sequence: oligonucleotide

<400> 20
tctgtctgtc ccacacactg cc 22

<210> 21 
<211> 19 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 21
gactggctcc gtctctctg 19

<210> 22 
<211> 21 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 22
aagcaacaga atctcccatc c 21

<210> 23 
<211> 21 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 23
gcattgtcaa aattgcccat c 21

<210> 24 
<211> 20 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 24
aggcggagaa atacgaagac 20

<210> 25 
<211> 22 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 25
gcagagtgag acagccctta ac 22

<210> 26
<211> 24 
<212> DNA 

<213> Artificial sequence 



<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 26
cttcctcagg actggcgact tcag 24



<210> 27 
<211> 24 
<212> DNA 

<213> Artificial sequence 



<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 27
caagcggtcg ttcattccaa agag 24



<210> 28 
<211> 22 
<212> DNA 

<213> Artificial sequence 



<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 28
aagaggagat aacccaccag ag 22



<210> 29 
<211> 20 
<212> DNA 

<213> Artificial sequence 



<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 29
agggctgctg gctatttttc 20



<210> 30 
<211> 19 
<212> DNA 

<213> Artificial sequence 



<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 30
taagaaatgg gttgtgaac 19

<210> 31
<211> 21 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 31
aagcaacaga atctcccatc c 21

<210> 32 
<211> 21 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 32
gcattgtcaa aattgcccat c 21

<210> 33 
<211> 20 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 33
aggcggagaa atacgaagac 20

<210> 34 
<211> 22 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 34
gcagagtgag acagccctta ac 22

<210> 35 
<211> 24 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 35
cttcctcagg actggcgact tcag 24



<210> 36 
<211> 24 
<212> DNA 
<213> Artificial sequence

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 36
caagcggtcg ttcattccaa agag 24

<210> 37 

<211> 22 

<212> DNA

<213> Artificial sequence

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 37
aagaggagat aacccaccag ag 22

<210> 38 
<211> 18 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 38
aatggaaggg cgtgacgc 18

<210> 39 
<211> 21 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 39
cctcacgcct gctgcaacct g 21

<210> 40 
<211> 31 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 40
gcacgaattc atggcccaag aaatagatct g 31

<210> 41 
<211> 24 
<212> DNA 

<213> Artificial sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 41
ctgtcttcgt atttctccgc cttg 24

<210> 42 
<211> 2347 
<212> DNA 

<213> Homo sapiens 

SMflS cactgaggga tgccagttct gggttcat ctgaacctg gtctaaga 60 

gggaagaggc gttgcccctg ctggcatagt caggtaccag cttaaatgga agggcgtgag 180 

gctgagcttt tcatgatggt tcctgctgac "yy cgaaaacgga gaagaaatgg 240 

cgcttggtcc atgcagtgaa Qctcttccaa cctgggt y t ctccaggtcc 300 

cccaagaaat agatctgagt gctctcaagg ^ttag a | c « cggaaa ctgaaaacac 360 

SSSS35S SSS II? Is- SS« 

ss&b sot 1 S s Iks ssssss 

tgcaatctta tcagaagctg ^gcaaaattt «9gy att tggtcag tttagaggat 780 
qcgagagcca gtgcagccgc agtcctggca 99"a"«H ccac ^ t gaaa aagctctcca 840 
Willi Tc? | ,c cac ? |c ccc jgegtjg | 

safe sssu I si sra asssa 

acaatgctaa tgtcactgga gaaatagaat "9«a« aagaa aagaagaaaa 1260 

ctttagaaat atgcatcaag gcctgtaaga ^ccttgccr ggaa agcgca 1320 

agtgcaatcc gtatgtgaag acctacctgt tgcccga y t aagtatcagg 1380 

aaactggagt ccaaaggaac accgtggacc cgacmi. ^tgtggcat ctgggcacgc 1440 
?ggcccctgc ccagctggtg acccggcagc tgcaggtctc ggtg gg tttgaag 1500 

felt! SSSSS | W »? SSfSf 

fas s» i iii sssss ssasa 

tcctgaggaa gcaggcttgc ccccagtgga ^acactcax y ctctttggaa 1920 

ggcgtgaggc actgtgcgtc tgcagagggg ^acga c £ C atggctt aaccgcctat 2220 
gacccctttg accttgagca gtctccatct gcggccc g tct ggtgggttat 2280 

Ifcfcftfg JSSSS fatgg-ga Staitaaa cgttgttacc catgaaaaaa 2340 



aaaaaaa 



<210> 43 
<211> 610 
<212> PRT 

<213> Homo sapiens 



<400> 43

Met Ala Gln Glu Ile Asp Leu Ser Ala Leu Lys Glu Leu Glu Arg Glu
1               5                   10                  15

Ala Ile Leu Gln Val Leu Tyr Arg Asp Gln Ala Val Gln Asn Thr Glu
            20                  25                  30

Glu Glu Arg Thr Arg Lys Leu Lys Thr His Leu Gln His Leu Arg Trp
        35                  40                  45

Lys Gly Ala Lys Asn Thr Asp Trp Glu His Lys Glu Lys Cys Cys Ala
    50                  55                  60

Arg Cys Gln Gln Val Leu Gly Phe Leu Leu His Arg Gly Ala Val Cys
65                  70                  75                  80

Arg Gly Cys Ser His Arg Val Cys Ala Gln Cys Arg Val Phe Leu Arg
                85                  90                  95

Gly Thr His Ala Trp Lys Cys Thr Val Cys Phe Glu Asp Arg Asn Val
            100                 105                 110

Lys Ile Lys Thr Gly Glu Trp Phe Tyr Glu Glu Arg Ala Lys Lys Phe
        115                 120                 125

Pro Thr Gly Gly Lys His Glu Thr Val Gly Gly Gln Leu Leu Gln Ser
    130                 135                 140

Tyr Gln Lys Leu Ser Lys Ile Ser Val Val Pro Pro Thr Pro Pro Pro
145                 150                 155                 160

Val Ser Glu Ser Gln Cys Ser Arg Ser Pro Gly Arg Leu Gln Glu Phe
                165                 170                 175

Gly Gln Phe Arg Gly Phe Asn Lys Ser Val Glu Asn Leu Phe Leu Ser
            180                 185                 190

Leu Ala Thr His Val Lys Lys Leu Ser Lys Ser Gln Asn Asp Met Thr
        195                 200                 205

Ser Glu Lys His Leu Leu Ala Thr Gly Pro Arg Gln Cys Val Gly Gln
    210                 215                 220

Thr Glu Arg Arg Ser Gln Ser Asp Thr Ala Val Asn Val Thr Thr Arg
225                 230                 235                 240

Lys Val Ser Ala Pro Asp Ile Leu Lys Pro Leu Asn Gln Glu Asp Pro
                245                 250                 255

Lys Cys Ser Thr Asn Pro Ile Leu Lys Gln Gln Asn Leu Pro Ser Ser
            260                 265                 270

Pro Ala Pro Ser Thr Ile Phe Ser Gly Gly Phe Arg His Gly Ser Leu
        275                 280                 285

Ile Ser Ile Asp Ser Thr Cys Thr Glu Met Gly Asn Phe Asp Asn Ala
    290                 295                 300

Asn Val Thr Gly Glu Ile Glu Phe Ala Ile His Tyr Cys Phe Lys Thr
305                 310                 315                 320

His Ser Leu Glu Ile Cys Ile Lys Ala Cys Lys Asn Leu Ala Tyr Gly
                325                 330                 335

Glu Glu Lys Lys Lys Lys Cys Asn Pro Tyr Val Lys Thr Tyr Leu Leu
            340                 345                 350

Pro Asp Arg Ser Ser Gln Gly Lys Arg Lys Thr Gly Val Gln Arg Asn
        355                 360                 365

Thr Val Asp Pro Thr Phe Gln Glu Thr Leu Lys Tyr Gln Val Ala Pro
    370                 375                 380

Ala Gln Leu Val Thr Arg Gln Leu Gln Val Ser Val Trp His Leu Gly
385                 390                 395                 400

Thr Leu Ala Arg Arg Val Phe Leu Gly Glu Val Ile Ile Pro Leu Ala
                405                 410                 415

Thr Trp Asp Phe Glu Asp Ser Thr Thr Gln Ser Phe Arg Trp His Pro
            420                 425                 430

Leu Arg Ala Lys Ala Glu Lys Tyr Glu Asp Ser Val Pro Gln Ser Asn
        435                 440                 445

Gly Glu Leu Thr Val Arg Ala Lys Leu Val Leu Pro Ser Arg Pro Arg
    450                 455                 460

Lys Leu Gln Glu Ala Gln Glu Gly Thr Asp Gln Pro Ser Leu His Gly
465                 470                 475                 480

Gln Leu Cys Leu Val Val Leu Gly Ala Lys Asn Leu Pro Val Arg Pro
                485                 490                 495

Asp Gly Thr Leu Asn Ser Phe Val Lys Gly Cys Leu Thr Leu Pro Asp
            500                 505                 510

Gln Gln Lys Leu Arg Leu Lys Ser Pro Val Leu Arg Lys Gln Ala Cys
        515                 520                 525

Pro Gln Trp Lys His Ser Phe Val Phe Ser Gly Val Thr Pro Ala Gln
    530                 535                 540

Leu Arg Gln Ser Ser Leu Glu Leu Thr Val Trp Asp Gln Ala Leu Phe
545                 550                 555                 560

Gly Met Asn Asp Arg Leu Leu Gly Gly Thr Arg Leu Gly Ser Lys Gly
                565                 570                 575

Asp Thr Ala Val Gly Gly Asp Ala Cys Ser Gln Ser Lys Leu Gln Trp
            580                 585                 590

Gln Lys Val Leu Ser Ser Pro Asn Leu Trp Thr Asp Met Thr Leu Val
        595                 600                 605

Leu His
    610

<210> 44 
<211> 1648 
<212> DNA 

<213> Homo sapiens 

gaattcatgc ccctcgtaga gcagcaggtc caagcagg|c tgctggctat ttttcegj. ^ 

SSSSSS ESSE ?| HI «fi 33«3S |jj 






atcccaaatg ctctactaac cctattttg. agcaacagaa tctcccjtcc agtccggc^c 360 
ccagtaccat attctctgga ggttttagac acggaagttt |«xag ? ttg 2cattc 480 
gtacagagat gggcaatttt gacaatgcta atgtactgg ajaaatag ^ cct 540 
attattgctt caaaacccat tctttagaaa jatgcaxcaa yy | * ttgcccgaca 600 
atggagaaga aaagaagaaa aagtgcaatc cgtatgtgaa gaccx y y acctttc 660 
gatcctccca gggaaagcgc aagactggag jccaaaggaa jaccgiyy ct * caggtct 720 
aggagacctt gaagtatcag gtggcccctg "cagctggt gacccgg g M at »» ctc n0 

SS85 £2832 SffiSS a jgggffi 88 

assss ssssss fa 3^ g^™ -o 

alctgagact gaagtcgcca gtcctgagga agcaggcttg "cccag yy 1200 
ttgtcttcag tggcgtaacc ccagctcagc tgaggcagtc ggTTggy cttggt tcaa 1260 
gggatcaggc cctctttgga atgaacgacc 9"tgcttgg gg|aaccag 1320 
Igggagacac agctgttggc ggggatgcat gctcacaatc gaagcxc y yy | 1380 
tcctttccag ccccaatcta tggacagaca tgactcttgt cctgcacxg y yy^ ^ 
tcaaggttcc aggttgcagc aggcgtgagg "Ctgtgcgt ctgcag yyy y * 1500 
aggtgcaggg tcccagctgg agaccccttt gaccttgagc agtctc ^hh ^ 
SSSJSS JSSSSSI JSPcSS? SSSSU aaatggccag attttaataa 1620 
acgttgttac ccatgaaaaa aaaaaaaa 



<210> 45 
<211> 313 
<212> PRT 

<213> Homo sapiens 

<400> 45 
Met Gly Asn Phe Asp Asn Ala Asn Val Thr Gly Glu Ile Glu Phe Ala
1               5                   10                  15

Ile His Tyr Cys Phe Lys Thr His Ser Leu Glu Ile Cys Ile Lys Ala
            20                  25                  30

Cys Lys Asn Leu Ala Tyr Gly Glu Glu Lys Lys Lys Lys Cys Asn Pro
        35                  40                  45

Tyr Val Lys Thr Tyr Leu Leu Pro Asp Arg Ser Ser Gln Gly Lys Arg
    50                  55                  60

Lys Thr Gly Val Gln Arg Asn Thr Val Asp Pro Thr Phe Gln Glu Thr
65                  70                  75                  80

Leu Lys Tyr Gln Val Ala Pro Ala Gln Leu Val Thr Arg Gln Leu Gln
                85                  90                  95

Val Ser Val Trp His Leu Gly Thr Leu Ala Arg Arg Val Phe Leu Gly
            100                 105                 110

Glu Val Ile Ile Pro Leu Ala Thr Trp Asp Phe Glu Asp Ser Thr Thr
        115                 120                 125

Gln Ser Phe Arg Trp His Pro Leu Arg Ala Lys Ala Glu Lys Tyr Glu
    130                 135                 140

Asp Ser Val Pro Gln Ser Asn Gly Glu Leu Thr Val Arg Ala Lys Leu
145                 150                 155                 160

Val Leu Pro Ser Arg Pro Arg Lys Leu Gln Glu Ala Gln Glu Gly Thr
                165                 170                 175

Asp Gln Pro Ser Leu His Gly Gln Leu Cys Leu Val Val Leu Gly Ala
            180                 185                 190

Lys Asn Leu Pro Val Arg Pro Asp Gly Thr Leu Asn Ser Phe Val Lys
        195                 200                 205

Gly Cys Leu Thr Leu Pro Asp Gln Gln Lys Leu Arg Leu Lys Ser Pro
    210                 215                 220

Val Leu Arg Lys Gln Ala Cys Pro Gln Trp Lys His Ser Phe Val Phe
225                 230                 235                 240

Ser Gly Val Thr Pro Ala Gln Leu Arg Gln Ser Ser Leu Glu Leu Thr
                245                 250                 255

Val Trp Asp Gln Ala Leu Phe Gly Met Asn Asp Arg Leu Leu Gly Gly
            260                 265                 270

Thr Arg Leu Gly Ser Lys Gly Asp Thr Ala Val Gly Gly Asp Ala Cys
        275                 280                 285

Ser Gln Ser Lys Leu Gln Trp Gln Lys Val Leu Ser Ser Pro Asn Leu
    290                 295                 300

Trp Thr Asp Met Thr Leu Val Leu His
305                 310

<210> 46 
<211> 21 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> Description of the artificial sequence: oligonucleotide

<400> 46
tcgtagagca gcaggtccaa g 21